THE REVOLUTION IN GRAPHIC TECHNOLOGY*

By  Vincent  P.  Barabba 

Director,  Bureau  of  the  Census 

U.S.  Department  of  Commerce 

During  1975  and  1976,  I  participated  in  two  conferences  relative 
to  the  graphic  presentation  of  information.  The  first  was  the  Inter- 
national Symposium  on  Computer-Assisted  Cartography  (Auto-Carto  II)  in 
Reston,  Virginia.  The  second  conference  was  a  section  meeting  of  the 
American  Statistical  Association's  annual  meeting  in  Boston.  Then  in 
1978,  at  the  Conference  on  Social  Graphics,  I  gave  a  talk,  the  sub- 
stance of  which  was  drawn  from  those  two  presentations. 

My  thoughts  on  this  subject  have  not  changed  that  much  in  the 
last  5  years.   However,  though  the  substance  of  my  thoughts  has  not 
changed,   my  ability  to  demonstrate  the  viability  of  these  thoughts 
has  increased  dramatically.   So  this  morning  you  are  going  to  hear 
some  of  my  thoughts  revisited  but  with  a  significantly  different  form 
of  presentation. 


*This  paper  was  presented  at  the  annual  meeting  of  the  American 
Association  for  the  Advancement  of  Science  in  San  Francisco,  Calif., 
January  8,  1980.   The  author  thought  that  this  presentation  would  be 
more  meaningful  than  his  original  presentation,  as  indicated  in  the 
text. 

The  charts  accompanying  this  paper  were  originally  generated 
from  an  array  of  hardware  support  consisting  of  a  Ramtek  6200A 
Colorgraphics  computer  terminal  and  a  GE  Light  Valve  (a  projection 
system).   The  visuals  for  the  talk  were  stored  in  digital  form  on  a 
tape  cassette,  and  the  information  on  the  tape  was  read  and  translated 
into  graphic  forms  by  a  microprocessor  in  the  Ramtek  terminal.   As 
the  graphics  were  drawn  into  the  terminal  screen,  the  necessary 
signals  were  also  being  sent  from  the  terminal  to  the  GE  projection 
system,  which  displayed  them  on  a  standard  8  by  8  screen.   A  Xerox  6500 
color  graphics  printer  can  also  be  tied  in  for  making  color  copies  on 
regular  paper  or  color  transparencies  of  the  graphics  being  displayed 
on  the  terminal. 


In the Auto-Carto II presentation, I identified, as an innovation,
"a fully automated and standardized graphical presentation system."
I revisited some very ancient and some rather current history to point
out that the elements of such a system were not new. My contention was
that the notion of putting graphics, standards, and automation together
was, at the very minimum, perceived as new by those who would be asked
to adopt such a system.

In  the  presentation  at  the  American  Statistical  Association,  I 
described  a  plan  for  such  a  system  at  the  Census  Bureau  and  gave  some 
predictions  as  to  what  the  future  held  for  enhancing  the  beginning 
elements  of  the  system  that  was  in  place  at  the  Bureau.   At  the 
Conference  on  Social  Graphics,  I  combined  elements  of  both  presenta- 
tions, and  today  I  want  to  revisit  those  comments  because  I  can't  resist 
pointing  out  that  those  predictions  came  through  and  that  some  of 
those  enhancements  are  actually  in  the  marketplace  today.   In  fact,  I 
will  actually  use  some  of  these  enhancements  during  this  presentation. 

Well, where are we on the Adoption Curve? The answer is -- No one
knows! And that, in my judgment, is a very important problem that needs
to  be  addressed  by  those  who  are  interested  in  encouraging  the  utili- 
zation of  the  graphic  form.   Therefore,  let  me  spend  some  time 
describing  what  I  mean  by  the  problem. 

First,  what  do  we  mean  by  the  Adoption  Curve?   The  Adoption 
Curve,  as  used  in  this  presentation,  will  follow  the  traditional 
accumulation  of  a  bell-shaped  curve  of  the  number  of  adopters  over 
time. 
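
To make that definition concrete, the following is a minimal sketch (not taken from Rogers or from the charts shown in this talk) of how a bell-shaped count of new adopters per period accumulates into an S-shaped Adoption Curve. The normal profile and the names `peak`, `spread`, and `total` are illustrative assumptions, not measured data.

```python
import math

def bell_adopters(t, peak=10.0, spread=3.0, total=1000.0):
    """New adopters in period t, under an assumed normal (bell-shaped) profile."""
    z = (t - peak) / spread
    return total * math.exp(-0.5 * z * z) / (spread * math.sqrt(2 * math.pi))

def adoption_curve(periods, **kwargs):
    """Running total of adopters: the accumulated bell-shaped curve."""
    cumulative, running = [], 0.0
    for t in range(periods):
        running += bell_adopters(t, **kwargs)
        cumulative.append(running)
    return cumulative

# Hypothetical example: 21 periods, adoption peaking in period 10.
curve = adoption_curve(21)
```

The running total rises slowly at first, fastest near the peak of the bell, and then levels off as the pool of potential adopters is exhausted.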
Why is it so important to understand it?

This chart demonstrates the wide range of adoption periods of
educational ideas. The red curve represents the percentage of adoption
of  kindergarten  in  public  schools.   The  blue  curve  represents  the  per- 
centage of  adoption  of  driver  training  courses  in  selected  public 
schools.   The  green  curve  represents  the  percentage  of  school  super- 
intendents in  a  Pennsylvania  county  adopting  modern  math. 

Everett  Rogers,  now  a  professor  of  communications  at  Stanford, 
points  out,  "These  data  raise  the  question  of  why  modern  math  diffused 
so  much  more  rapidly  than  kindergartens  or  even  driver  training. 
Likely  reasons  include  (1)  the  aftermath  of  Sputnik  (which  caused  wide- 
spread dissatisfaction  with  U.S.  public  education  and  has  led  since 
1958  to  massive  federal  programs  of  financial  assistance  to  encourage 
educational innovations); (2) the active promotion of modern math (and
driver training to a lesser degree) by a prestigious and powerful
change agency (thus, modern math is an example of directed change,
whereas kindergarten illustrates selective contact change); and (3) the
contemporary value placed by the public on change in education, which
was not so strong prior to the late 1950's."

One  could  raise  many  questions  about  the  consequences  that  the 
adoption  rate,  as  reflected  by  the  curves,  had  on  the  individuals 
affected (or not affected) by the adoption as well as the impact the
rate of adoption had on the innovation itself. The important point is,
we  can  learn  much  from  understanding  the  adoption  process. 

Well,  a  review  of  the  literature  tells  us  that  not  only  has  there 
been  a  discipline  developed  in  understanding  educational  innovations, 
but  some  very  significant  work  has  been  accomplished  in  understanding 
the  impact  of  major  changes  in  health  care  delivery,  agriculture,  and 
business.  Some  of  the  specific  types  of  innovations  for  which  we  can 
find  considerable  empirical  information  about  their  adoption  are: 

.  Medical  drugs 

.  Family  planning  methods 

.  Agricultural  methods 

.  Communication channels

Now, what do we know about graphics? Interestingly, we know a
little bit more than we knew just a few years ago! First we can be
thankful  to  Jim  Beniger  and  Dorothy  Robyn  for  providing  us  with 
a  detailed  list  of  graphic  events  which  have  occurred  throughout 
history.   I  have  taken  the  liberty  of  illustrating  their  list  of 
major  statistical  graphic  events  and  it  is  quite  interesting. 


[Chart: Statistical Graphic Events, by ten-year periods]

This chart is a simple bar chart which identifies the number of
graphic events which occurred in each 10-year period from 1600 to
1970. The graph clearly reinforces Bill Kruskal's comments at the
Auto-Carto II conference that "The role of statistical graphics within
statistics generally...has had tremendous ups and downs.... In recent
years...there has been a renaissance of concern with graphics and
some of our best statistical minds have suggested new graphical
approaches of great interest."

This  data  can,  of  course,  be  reformatted  so  that  we  can  illus- 
trate the  cumulative  number  of  major  events  which  have  occurred  since 
1600. First we increase the scale to accommodate the total number of
events.

Then we actually accumulate these events.

I'm sure that there are many things we can (and many we cannot)
understand from this chart. However, it is clear that if the next 20
years produce graphical events at the same rate that the previous 20
years have, we are going to be able to do a lot more than we can
today--assuming, of course, that there is someone out there who
wants to use these capabilities.

A  second  thing  that  we  know  that  we  did  not  know  a  few  years  ago 
is  something  about  how  graphics  have  been  used  in  the  publications  of 
statisticians. For this we can thank Stephen Fienberg and his
colleagues from the University of Minnesota. What they did was try to
find  out  the  extent  to  which  statisticians  were  using  graphics  and 
whether  there  had  been  a  shift  from  using  graphics  for  "illustration 
(and  communication)  to  analysis  (and  explanation)." 

To  do  this,  they  examined  all  graphs  published  in  the  Journal  of 
the  American  Statistical  Association  (JASA)  and  Biometrika  during  six 
5-year  spans  beginning  with  1921-25  and  ending  with  1972-75.   The 
following  chart  illustrates  the  actual  space  taken  up  by  graphs  as  a 
percentage  of  the  total  space  in  those  two  publications.   The  graphs 
illustrated  in  these  bars  exclude  graphs  which  did  not  in  any  way 
involve  data. 

The color segments of the bars illustrate three types of data
graphs: The red color depicts "...graphs intended to display data and
results of Monte Carlo studies, scatter plots (even those with an
accompanying regression line)."

The blue color depicts "...plots and graphs with elements of both
data display and analysis--e.g., charts from older papers involving
primitive forms of analysis; and graphs of posterior distributions."

The green color depicts "Analytical graphs--residual plots,
half normal and other probability plots where conclusions are drawn
directly from graph, graphical methods of performing calculations,
and spectrum estimates from time series."

As Fienberg states in his paper, "These graphs clearly show the
decline in the use of statistical graphics during this century, at
least within two of our major statistical journals."

I am not sure whether the significance of this decline is
emphasized or lost when again we adjust the scale to a full 100
percent. On the one hand the decline is less obvious to the eye, but
on the other hand it becomes even clearer that graphics are
not dominating the space in these two publications. I find both
observations all the more interesting when we recall the cumulative
growth curve from Beniger and Robyn's work. What the next two charts
illustrate are the results of classifying the historical events
identified by Beniger and Robyn using the classification scheme developed
by Fienberg.

First we show the simple bar chart color coded with the classification
scheme, and then--as we did before--we accumulate these events over time.

Well, that is about all I was able to find relative to
utilization of graphics. I did find much discussion about "What's good about
my graphics," and "what's bad about someone else's." And there is some
work coming from the ASA's newly formed Standing Committee on
Statistical Graphics, as well as from the Social Graphics Project of the
Bureau of Social Science Research. But as it relates to the
utilization of the graphic form, we really do not know very much.

In  essence,  we  have  been  focusing  on  the  graphics  and  not  on  the 
graphics  user.   It  seems  to  me  that  we,  as  a  community  interested  in 
graphics,  could  legitimately  be  asked:   "Are  we  doing  graphics  for 
graphics'  sake?"  or  "Are  we  trying  to  find  out  how  to  transfer 
knowledge  to  others  efficiently  and  effectively?"   If  our  purpose  is 
the  latter,  then  let  me  return  to  some  of  the  comments  I  made  in  the 
Auto-Carto  II  presentation,  where  I  referred  to  the  pioneering  work  of 
Everett  Rogers,  relating  to  the  innovation  process.   Rogers  defines 
an  innovation  this  way: 

"An  idea,  practice,  or  object  perceived  as  new  by  an  individual. 
It  matters  little,  so  far  as  human  behavior  is  concerned, 
whether  or  not  an  idea  is  'objectively'  new  as  measured  by  the 
lapse of time since its first use or discovery. It is the
perceived or subjective newness of the idea for the individual that
determines his reaction to it. If the idea seems new to the
individual, it is an innovation."

I  find  a  sense  of  direction  in  the  work  of  Rogers  as  he  outlines 
the  complexity  of  the  innovation-decision  process.   Before  you 
introduce  any  innovation  to  an  agency  or  corporation,  you  must  really 
understand  the  people  involved,  both  as  individuals  and  as  a  group. 

To understand them as individuals, we need to know several things.
What is their ability to handle sophisticated tools? Do they
basically understand what's involved? How secure are they in their
positions? What are their values? And, can they influence others?

To  understand  how  the  members  interact,  we  need  to  know  how 
traditional  the  group  is.   Do  they  do  something  a  certain  way  because 
they  have  always  done  it  like  that,  or  are  they  willing  to  try  new 
ideas?  What  are  the  organization's  characteristics?   The  pressures 
developed  by  each  organization  must  be  handled  differently.   It  is 
equally  important  to  remember  that  the  acceptance  of  an  innovation 
does  not  ensure  its  use  by  the  organization.   Therefore,  the  first 
step  is  to  take  a  good  look  at  the  organizational  environment  from 
both  points  of  view. 

Rogers  cites  four  phases  in  the  innovation  adoption  process. 
These are the knowledge, persuasion, decision, and confirmation
stages.

First,  in  the  knowledge  stage,  the  individual  is  exposed  to 
the  innovation's  existence,  and  he  gains  some  understanding  of  its 
functions. The individual's attitude toward the innovation hinges on
whether he sees the innovation as a solution to a problem or a threat
to his own continued employment.

Second,  during  the  persuasion  stage,  the  individual  forms  a 
favorable  or  unfavorable  opinion  about  the  concept.   The  individual 
becomes  more  psychologically  involved  as  his  knowledge  of  the  innova- 
tion increases.   Since  there  is  some  degree  of  risk  involved  in 
adopting  an  innovation,  he  may  seek  reinforcement  of  his  attitude  from 
his  peers. 

During the third phase, the decision stage, the individual
undertakes activities which will ultimately lead to the adoption or
rejection of the innovation. Rogers points out that most individuals do
not  accept  an  innovation  without  first  using  it  on  a  probationary 
basis  to  determine  its  utility.   Because  of  the  obvious  advantages  of 
a  trial  prior  to  adoption,  those  innovations  available  for  a  trial 
period  are  generally  adopted  more  rapidly.   But,  it  is  just  as  pos- 
sible that  the  trial  will  result  in  the  rejection  of  the  concept. 
Rogers  suggests  that  while  rejection  may  occur  at  any  time,  it  usually 
happens  at  this  point  in  the  adoption  process  because  the  results  of 
the  trial  are  misinterpreted. 

Based  on  empirical  evidence,  Rogers  and  others  suggest  there  is 
an  additional  stage  beyond  the  initial  decision  to  adopt  or  reject  the 
innovation.   This  fourth  stage  is  one  of  further  testing  by  direct 
application  and  confirmation.   The  individual  seeks  further  reinforce- 
ment of  his  attitude. 

I  would  like  to  emphasize  that  in  the  adoption  of  many 
innovations,  it  is  critical  to  ensure  that  the  ultimate  choice  not  be 
limited  only  to  the  nominal  head  or  to  relatively  few  leaders  of  an 
organization. Those participants in any position of influence, as
well as those who would use the innovation, should be alerted to the
impending decision and encouraged to express their opinions.

An  important  ingredient  dictating  how  an  innovation  fares 
through  this  process  of  adoption  is  the  innovation  itself.   But  how 
the  concerned  individuals  perceive  an  innovation  also  affects  the 
adoption.   The  individual's  perception  of  the  innovation  centers  on 
five characteristics: (1) the ability to identify the relative
advantages of having such a system versus not having it; (2) the
ability to illustrate the manner in which such an information system
is compatible with present organizational procedures; (3) the ability of
successful and qualified people from many areas of endeavor to quickly
absorb the technical jargon of a vast and complex tool; (4) the need
to divide the concept and test it in a way that no person or
organization feels threatened; and (5) communicating the results of the effects
of such an innovation.


PERCEIVED                AFFECTS THE RATE OF ADOPTION BY
                         INCREASING IT       SLOWING IT

RELATIVE ADVANTAGE             X
COMPATIBILITY                  X
COMPLEXITY                                        X
TRIALABILITY                   X
OBSERVABILITY                  X

After  more  fully  describing  these  attributes,  Rogers  goes  on  to 
observe  how  they  relate  to  the  rate  of  adoption:   On  the  left  side  of 
the  chart,  we  identify  the  five  attributes  perceived  by  potential 
adopters;  on  the  right  side,  we  identify  whether  the  attribute 
increases  or  slows  down  the  rate  of  adoption. 

I think that most of us would agree with the observations
presented by Rogers. In fact, we might very well ask, "Why was it
necessary for him to state the obvious?"

But  he  goes  a  step  further  by  listing  the  number  of  empirical 
studies  that  support  and  do  not  support  each  of  the  five  general- 
izations he  makes  regarding  attributes  of  the  innovation  and  its  rate 
of  adoption. 


PERCEIVED                EMPIRICAL EVIDENCE
                         AGREE     DISAGREE     TOTAL

RELATIVE ADVANTAGE         29         14          43
COMPATIBILITY              18          9          27
COMPLEXITY                  9          7          16
TRIALABILITY                9          4          13
OBSERVABILITY            (illegible) (illegible)   9

Not  all  of  Rogers'  findings,  and  particularly  those  related  to 
complexity,  have  been  fully  substantiated  by  empirical  evidence.   That, 
of  course,  does  not  mean  they  are  incorrect.   It  simply  means  they 
are  hypotheses  not  yet  fully  tested. 

This  review  of  Rogers'  work  brings  two  benefits:   First,  his 
findings  provide  us  a  sense  of  direction  in  finding  the  most  effective 
ways of understanding the diffusion process so that we can see that
our  ideas  are  not  only  developed  but  adopted  as  well.   Secondly,  we 
can  see  from  reviewing  his  approach  the  need  and  the  importance  of 
empirical  evidence  to  demonstrate  the  utility  of  our  ideas  as  we 
attempt  to  have  them  adopted. 

So  far,  we  have  dealt  with  the  general  concepts  behind  the  theory 
of  the  adoption  of  innovations  as  presented  by  Rogers.   A  key  question 
that  must  be  answered  is  "Does  the  theory  hold  up  in  the  bright  light 
of actual problems?" On the basis of my experiences, I believe it
does.

In  following  some  of  the  work  that  Rogers  and  others  have 
accomplished,  two  major  points  are  clear.   First,  trying  to  understand 
the eventual user community is difficult and complex. Indeed, it is
so complex, we don't even try, and are often faced with an attitude
of "Technology is the answer, but what was the problem?" Second, one
of  the  most  important  overriding  aspects  relative  to  the  adoption  of  an 
innovation  is  communication.   And  though  some  of  the  previous 
conferences  I  have  mentioned  are  strong  indicators  of  our  understanding 
of  the  need  to  communicate,  I  see  our  communication  channel  up  to 
this  point  as  having  been  primarily  horizontal: 


STATISTICIAN  <=>  DATA ANALYST

The time has come, however, to turn our attention to a vertical
channel of communication:


              ULTIMATE USERS
                    |
     STATISTICIAN  <=>  DATA ANALYST
                    |
     PRODUCERS: HARDWARE, SOFTWARE

As  vertical  communication  relates  to  users,  we  need  to  fully 
understand  that  there  are  many  users  with  different  needs  and 
different  attitudes  toward  change.   We  will  have  to  understand  the 
differences  and  how  to  deal  with  them. 

As vertical communication relates to the producers of hardware
and software, there is an imminent opportunity for cooperation and the
integration of various system components into complete systems which
deliver utility to the users. Here are examples of just some of the
suppliers involved:

.  Reprographics 

.  Graphics  software 

.  Computer  terminals 

.  Computer  mainframes 

.  Video  telecommunications 

.  Decision  support  software 

.  Telecommunications  networks 

These suppliers, in my judgment, must now summarize and define
the  requirements  of  the  various  user  groups.   Hardware  and  software 
modules  can  then  be  aimed  at  targeted  needs,  so  that  the  potential 
users  of  these  systems  will  not  have  to  design  to  their  own 
requirements.   Instead,  users  will  be  able  to  choose  between  packaged 
offerings  of  the  suppliers. 

To  get  started,  suppliers  must  catalog  and  package  system 
components  that  meet  specific  communication  needs  of  users.   They  must 
then  jointly  define  the  standards  and  products  which  will  allow  a 
smooth  interchange  of  information  as  well  as  an  efficient  diffusion 
of  this  innovation. 

I  hope  this  morning  I  have  demonstrated  several  things  to  you: 

(1) The technology relative to graphic presentation of
statistical information is adequate to handle most
of the problems we face. Virtually every graphic form used
in  this  presentation  was  initially  created  on  an  automated 
graphic  system,  primarily  by  an  individual  with  no  computer 
training  and  somewhat  limited  statistical  skills--me. 

(2)  There  is  a  body  of  knowledge  available  to  us  to  assist  us 
in  more  effectively  communicating  the  value  of  the  graphic 
form  of  information. 

(3) The time has come to quit talking to ourselves as data
analysts, statisticians, and computer scientists. The time
has come to bring the user and the producer of hardware and
software into the discussion.

In December of 1971 an interesting article appeared in the
American Statistician titled "Innovations in Communication: The
National Bureau of Economic Research and the Computer." The opening
paragraph is of interest.

"Within the last decade, the high-speed digital computer has
become  an  essential,  but  expensive,  research  tool  in  applied 
statistics  and  quantitative  economics.   Although  the  increased 
storage  and  computational  ability  has  yielded  significant  bene- 
fits, the  extremely  fast  pace  of  change  has  been  costly  and  has 
inevitably  caused  serious  dislocations.   The  costs  of  manipu- 
lating large  data  sets,  the  rapid  turnover  of  computer  hardware, 
and  the  duplication  of  programming  efforts  have  resulted  in 
relatively  ineffective  utilization  of  raw  computer  power.   Such 
dislocations  are,  of  course,  a  feature  of  any  new  technology, 
particularly  one  which  is  interdisciplinary  in  nature. 
Innovations occur randomly at first, creating isolated sets of
experimental applications. Although the innovations are
responses to the requirements of particular situations, over time
they achieve wider application both functionally and
geographically. Eventually we find individuals and institutions
attempting to develop procedures designed to alleviate the
severity of the new technology's growing pains. Researchers in
quantitative economics are in this position today."

That was 1971.

The  same  can  be  said  of  those  of  us  interested  in  further 
utilization  of  the  graphic  form  of  information.   The  NBER  has  had 
limited  success  with  its  ventures  in  trying  to  overcome  some  of  the 
growing pains. As we enter the 1980's, I hope we can commit to a
similar endeavor and take advantage of their successes and failures
should we choose to "develop procedures designed to alleviate the
severity of a new technology's growing pains."

Well,  what  are  our  chances  of  doing  that?   I  personally  am  some- 
what optimistic.   My  optimism  is  best  reflected  in  the  story  told  of 
a  caveman. 

Once  there  was  a  caveman  who,  while  staring  out  at  the  moon  one 
night,  noticed  on  the  horizon  a  giant  tree  silhouetted  against  the  sky 
by  the  moon's  light.   As  he  sat  there  contemplating  the  sight,  he 
noticed  that  there  was  only  a  small  gap  between  the  top  of  the  giant 
tree  and  the  edge  of  the  moon.   The  gap  appeared  so  small  that  he 
thought  if  he  could  climb  to  the  top  of  the  tree  he  could  actually 
touch  the  moon. 

Excited  by  that  thought,  he  raced  to  the  tree,  quickly  climbed 
to  the  top,  and  reached  out  to  touch  the  moon.   When  he  realized  that 
he  could  not  actually  touch  the  moon,  he  said  to  himself,  "Well,  at 
least I'm closer now than I've ever been before."

ABSTRACT
The  use  of  diagrams  to  describe  networks  is  a  wide-spread  and  classi- 
cal application  of  graphical  presentation.   The  process  of  designing  such 
diagrams  has  not  generally  been  recognized  as  a  problem,  but  those  who 
have  engaged  in  it  may  recall  that  designing  a  good   diagram  requires  both 
skill  and  tedious  effort.   In  this  paper,  we  present  a  method,  based  on  the 
statistical  technique  called  multidimensional  scaling,  to  reduce  the  tedium 
and  perhaps  even  help  improve  the  results. 

I. INTRODUCTION
The use of diagrams to describe networks is a classical application of
graphical  presentation.   Such  diagrams  are  widespread,  perhaps  because  net- 
works are  ubiquitous  and  a  well-drawn  diagram  seems  to  provide  the  best  way 
of  describing  a  complex  network.   In  this  paper,  we  present  a  method  for 
aiding  the  person  who  is  designing  a  network  diagram.   Figure  5A  shows  one 
of  the  four  applications  described  below.   Although  the  method  is  computer- 
based,  it  does  NOT  simply  automate  the  steps  ordinarily  used  in  designing 
the  diagram,  but  instead  uses  a  mathematical  procedure  to  avoid  one  major 
step.   Note  that  in  this  paper  we  distinguish  sharply  between  the  network 
itself,  which  is  conceived  as  an  abstract  structure,  and  the  diagram  which 
is  used  to  portray  it.   Diagrams  of  the  same  network  can  differ  in  many 
ways:   location  of  the  nodes,  the  paths  followed  by  the  edges,  the  size  and 
shape  of  the  symbol  at  each  node,  the  thickness  and  character  (solid, 
broken,  dotted,  etc.)  of  the  lines  indicating  the  edges,  the  content  and 
placement of the labels, and so forth.

To  illustrate  how  widely  network  diagrams  are  used,  we  note  a  number 
of  applications: 

(1)  The  relationship  between  goals  and  objectives,  e.g.,  see  Sain  and 
Urhan  (1975)  and  DiCesare,  et  al  (1976); 

(2)  The  relationships  among  tasks  to  be  performed,  as  in  PERT  charts 
and  the  like,  e.g.,  see  Moore  and  Taylor  (1977)  and  Greenwood 
(1969); 

(3)  The  relationship  among  alternative  strategies,  e.g.,  see  Lipinski 
(1978)  and  Coates  (1976); 

(4)  Electrical circuit diagrams;

(5)   Flow  diagrams  for  computer  programming. 

See  Fig.  1  on  following  page. 

By  way  of  example,  Figure  1  shows  a  network  diagram  taken  from  Coates 
(1976).   (It  is  his  Figure  17,  and  is  labelled  "Signed  digraph  for  air  pol- 
lution abatement  strategies  ...".)   We  are  not  aware  of  any  published 
discussion  of  the  "conventional"  process  of  designing  such  diagrams,  but 
based  on  personal  experience  and  informal  discussion,  we  believe  that  most 
designers  go  through  an  extensive  trial-and-error  process,  involving  a  large 
number  of  partial  diagrams.   In  each  earlier  step,  typically,  a  few  nodes 
and/or  edges  are  added,  and  the  whole  diagram  rearranged.   In  the  later 
steps,  labelling  often  enters  the  picture. 

Not  only  is  the  process  time-consuming,  but  considerable  skill  is  re- 
quired to  produce  a  good  diagram.   The  need  for  a  better  procedure  is  in- 
dicated by  the  fact  that  in  a  recent  book  on  "Methodology  for  Large-Scale 
Systems"  (Sage,  1977),  the  very  first  specific  method  described  is  for 
this  purpose.   (Sage's  method,  though  possibly  useful,  seems  to  us  to  have 
several  limitations.) 

Today,  of  course,  it  is  possible  to  reduce  the  tedium  of  the  conven- 
tional procedure  by  the  use  of  interactive  computer  graphics  (for  those 
with access to the appropriate equipment). That is not the direction in
which  we  have  moved.   Instead,  we  describe  a  fully  algorithmic  method  which 
starts  from  a  description  of  which  nodes  are  connected  to  one  another  and 
yields  suggested  locations  for  the  nodes.   We  call  this  the  MDS-D  method, 
since  it  uses  a  procedure  called  multidimensional  scaling  (MDS)  to  help 
create  a  diagram  (D) .   Several  widely  used  computer  programs  are  available 
to  perform  MDS  (also  known  as  smallest  space  analysis,  or  SSA) ,  since  this 
is  a  standard  method  of  data  analysis  and  statistics.   We  will  give  a  brief 
description  of  MDS  in  Section  3. 

We  consider  it  entirely  appropriate  to  change  the  locations  provided 
by  MDS-D  as  much  as  desired.   MDS-D  is  best  used  to  aid  and  supplement  the 
creative  design  process  of  designing  a  good  diagram,  rather  than  to  supplant 
it.   For  a  skillful  designer,  MDS-D  offers  a  reduction  of  effort;  for  a  less 
skillful  one,  it  may  offer  some  improvement  of  the  final  result. 

A  primary  objective  of  network  diagrams  is  to  clearly  describe  the  net- 
work, and  the  system  it  represents.   Common  experience  shows  that  some  dia- 
grams are  more  successful  than  others,  and  it  is  easy  enough  to  make  terrible 
diagrams.   Although  judgment  of  quality  is  necessarily  a  subjective  process, 
quality  could  be  studied  by  the  methods  of  psychology.   While  we  are  not 
aware  of  any  such  studies  in  this  area,  there  have  been  studies  of  other 
visual  displays  such  as  scatter  diagrams  which  are  used  in  statistics,  e.g., 
see  Cleveland  (1978). 

II.       THE  MDS-D  METHOD,    PART    1:       FINDING    THE   GRAPH-THEORETIC   DISTANCES 
The  input  to  the  MDS-D  method  is  a  description  of  the  network.   To  use 
the  computer  program  which  we  wrote  to  carry  out  the  method,  the  network  must 
be  described  in  the  following  form,  where  the  nodes  are  referred  to  by  serial 
numbers from 1 to n:

    list of nodes connected to node 1;
    list of nodes connected to node 2;
      . . .
    list of nodes connected to node n.

The  entries  in  each  list  may  be  in  any  order,  and  may  contain  repeated 
entries  (to  indicate  multiple  edges).   A  node  may  be  connected  to  itself 
(since  such  circular  edges  are  not  uncommonly  encountered  in  diagrams). 
Whether or not edges are directed (i.e., have an arrowhead to indicate
direction), the method and the computer program ignore direction and treat
edges as undirected.  For this reason, the edge between i and j must appear
twice in the input (if $i \neq j$), in the list for node i and in the list
for node j.  This redundancy permits the program to check for consistency,
which we consider a desirable form of error-checking.
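
The consistency check just described can be sketched briefly in Python. This
is an illustration only, not the authors' program: the function name
`find_inconsistent_edges` and the dictionary input format are assumptions
made here for the example.

```python
from collections import Counter


def find_inconsistent_edges(adjacency):
    """Return the (i, j) pairs whose edge counts disagree between the two lists.

    `adjacency` maps each node number to the list of nodes connected to it.
    Repeated entries stand for multiple edges; a self-loop appears only in
    its own list, so it is skipped by the check.
    """
    counts = {node: Counter(neighbours) for node, neighbours in adjacency.items()}
    problems = []
    for i, row in counts.items():
        for j, k in row.items():
            if i == j:
                continue  # self-loops have no second list to compare against
            # The edge i-j must be listed the same number of times under j.
            if counts.get(j, Counter())[i] != k:
                problems.append((i, j))
    return problems
```

An empty result means every edge between distinct nodes appears the same
number of times in both lists.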

After  error-checking,  the  next  step  of  the  method  is  to  transform  the 
network  description  from  the  input  form  (chosen  for  human  convenience)  into 
a  form  which  is  convenient  for  computer  manipulation.   An  obvious  choice  is 
a symmetric matrix A in which the entry $a_{ij}$ indicates the "nature of
the relationship" between nodes i and j.

The  MDS-D  method  is  based  on  the  concept  (described  precisely  below)  of 
graph-theoretic distance $d_{ij}$ between nodes, which must be calculated
from the $a_{ij}$.  Hence the natural choice for $a_{ij}$ is the "distance
along the edge" connecting i and j.  Specifically, we define these distances
$a_{ij}$ as follows.  The distance $a_{ii}$ from i to i is 0 (regardless of
the network).  Next suppose $i \neq j$.  If there are no edges directly
between i and j, the distance $a_{ij}$ is infinity (represented in the
computer by a large number).  If there are k edges (with $k > 0$), two
alternative formulas may be used.  In one, $a_{ij} = 1$; in the other,
$a_{ij} = 1/k$.
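
As a hedged sketch of this definition, the following Python function builds
the matrix A from the node lists. The name `edge_distance_matrix`, the
`reciprocal` switch, and the use of `math.inf` (rather than a large number,
as in the authors' program) are assumptions of this example.

```python
import math
from collections import Counter


def edge_distance_matrix(adjacency, n, reciprocal=False):
    """Build the n-by-n matrix A of "distances along the edge".

    `adjacency` maps node numbers 1..n to lists of connected nodes.
    With reciprocal=False every connected pair gets distance 1;
    with reciprocal=True a pair joined by k edges gets distance 1/k.
    """
    A = [[math.inf] * n for _ in range(n)]
    for i in range(1, n + 1):
        for j, k in Counter(adjacency.get(i, [])).items():
            if j != i:
                A[i - 1][j - 1] = 1.0 / k if reciprocal else 1.0
    for i in range(n):
        A[i][i] = 0.0  # distance from a node to itself, regardless of loops
    return A
```

Rows and columns are 0-based in the result, so node i of the input
corresponds to index i - 1.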

It  would  be  possible  to  extend  the  method  to  cover  different  kinds   of 
edges,  such  as  weak  relationships  commonly  indicated  by  dashed  or  dotted 
lines,  and  strong  relationships  indicated  by  double,  triple  or  thick  lines. 
For  this  purpose,  weights   would  be  attached  to  the  edges  in  the  input.   If 
this  were  done,  the  definition  above  would  have  to  be  generalized  in  some 
suitable  manner. 

A  path  in  the  network  means  a  sequence  of  nodes.   The  length  of  a  path 
is  defined  in  the  obvious  way,  as  the  sum  of  the  distances  between  pairs  of 
adjacent  nodes.   The  graph-theoretic  distance  between  i  and  j  is  defined 
as  the  minimum  path  length  for  any  path  connecting  i  and  j . 

The  next  step  in  the  method  is  to  calculate  the  matrix  D  of  graph- 
theoretic  distances  between  the  nodes.   Our  computer  program  uses  a  well- 
known  trick  for  finding  D,  which  is  based  on  a  matrix  operation  analogous 
to matrix multiplication.  Ordinary matrix multiplication of X and Y is
described by $\sum_j x_{ij} y_{jk}$, a sum of products.  Define another
operation, say $X * Y$, based on a minimum of sums, namely
$\min_j (x_{ij} + y_{jk})$.  Then the entries in $A * A$ give the minimum
two-step path lengths.  If $A^{*i}$ indicates the ith power of A under the
* operation, its entries indicate the minimum i-step path lengths, so
$D = \min_i (A^{*i})$.  Here $A^{*1}$ indicates A itself, and i only needs
to range from 1 up to the number of steps in the longest path, namely,
n - 1.

Still greater efficiency in calculating D is attained by a second trick.
We define a slightly different operation, $X^{\#2} = \min(X, X^{*2})$.
This gives the shortest lengths for paths with one or two steps.  Then
$(X^{\#2})^{\#2}$ is easily seen to give the shortest lengths for paths
with up to four steps.  After k iterations, we obtain the shortest lengths
for paths with up to $2^k$ steps.  Thus at most $\log_2(n)$ iterations are
necessary.  In practice, our computer program compares each iterate with
the preceding one, and stops when the two iterates are equal.
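
The two tricks can be sketched as follows. This is a minimal illustration
under our own naming (`min_plus`, `graph_distances`), not the authors'
program; it assumes A is a square list of lists with zeros on the diagonal
and `math.inf` for unconnected pairs.

```python
import math


def min_plus(X, Y):
    """The * operation: entry (i, k) is the minimum over j of x_ij + y_jk."""
    n = len(X)
    return [[min(X[i][j] + Y[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]


def graph_distances(A):
    """Apply X -> min(X, X*X) repeatedly until two successive iterates agree."""
    D = [row[:] for row in A]
    while True:
        squared = min_plus(D, D)
        nxt = [[min(a, b) for a, b in zip(r1, r2)]
               for r1, r2 in zip(D, squared)]
        if nxt == D:  # stopping rule described in the text
            return D
        D = nxt
```

Each pass doubles the number of steps covered, so the loop runs only a
handful of times even for fairly large networks.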

We  digress  for  a  moment  to  point  out  an  interesting  relationship  be- 
tween the  *  operation  and  the  "circle"  operation,  introduced  as  a  special- 
purpose  "clean-up"  procedure  prior  to  using  MDS  by  Kendall  (1971  a,b). 
Working with non-negative matrices, he bases $X \circ Y$ on the sum of
minima, namely, its entries are given by $\sum_j \min(x_{ij}, y_{jk})$.
It is interesting that these two "opposite" procedures both are appropriate
in such similar contexts, namely, as a pre-processing step prior to use of
MDS, though the motivations for using them are very different.

Normally,  networks  requiring  diagrams  are  connected.   If  it  is  neces- 
sary to  provide  a  diagram  for  a  disconnected  network,  it  would  normally  be 
sufficient  to  place  diagrams  of  the  connected  components  together  on  the 
same  page.   The  placement  of  such  component  diagrams  in  a  pleasing  manner 
would  usually  be  a  simple  task  for  human  judgment.   For  this  reason,  it  is 
sufficient  to  limit  our  method  to  connected  networks,  which  is  fortunate, 
since  the  method  in  Part  II  might  not  handle  disconnected  networks  well. 

The  computer  program  checks  for  connectedness  by  examining  the  matrix 
D.  It is easy to see that the network is disconnected if and only if D contains an
infinite  distance  (in  principle),  or  a  sufficiently  large  value  (in  a  com- 
puter implementation).  In  fact,  in  general  D  will  have  block  diagonal 
form,  up  to  identical  row  and  column  permutations.  Each  block  corresponds 
to  a  component  of  the  network  and  contains  finite  distances,  while  the 
rest  of  D  contains  infinite  values. 
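
A hedged sketch of this check: given D, the connected components can be read
off by grouping nodes that lie at finite distance from one another. The
function name and the `big` threshold parameter are assumptions of this
example; a program that stores infinity as a large number would pass that
number as `big`.

```python
import math


def components_from_distances(D, big=math.inf):
    """Group node indices into components using the distance matrix D.

    Two nodes belong to the same component exactly when their
    graph-theoretic distance is below `big`.
    """
    n = len(D)
    seen = set()
    components = []
    for i in range(n):
        if i in seen:
            continue
        component = [j for j in range(n) if D[i][j] < big]
        components.append(component)
        seen.update(component)
    return components
```

The network is connected exactly when the function returns a single group.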

III.  THE MDS-D METHOD, PART 2: MULTIDIMENSIONAL SCALING

Multidimensional  scaling,  or  MDS,  is  a  standard  method  of  data  analy- 
sis and  statistics.   The  essential  kernel  of  this  method  is  finding  a  con- 
figuration of  points  from  the  matrix  of  distances.   While  this  can  be  done 
in  simple  cases  by  hand  with  pencil,  paper,  ruler  and  compass,  the  usual 
cases  are  much  harder  for  many  reasons.   With  real  data,  the  distances  are 
generally  contaminated  by  a  substantial  amount  of  random  error.   The  values 
of  the  contaminated  distances  may  not  be  known,  but  only  their  rank  order 
(or their values after an unknown monotonic transformation has distorted
them).   Frequently  the  data  are  incomplete. 

The  configuration  of  points  may  not  lie  in  two-dimensional  space  (the 
plane),  but  in  three  or  four  dimensional  space,  or  even  higher.   (However, 
the  use  of  more  than  four  dimensions  is  rare.)   Usually,  the  number  of  di- 
mensions is  not  known  in  advance,  but  must  be  estimated  from  the  data. 
Finally,  in  the  majority  of  applications,  it  is  not  known  in  advance  whether 
or  not  it  is  sensible  to  think  of  the  data  values  as  distances  between 
points;  the  idea  of  describing  them  this  way  is  simply  a  hypothesis  whose 
validity  must  be  assessed. 

Descriptions  of  MDS  at  many  different  levels  are  available  in  a  variety 
of  books  and  articles.   We  mention  only  a  few:   Kruskal  and  Wish  (1978), 
Carroll  and  Kruskal  (1978),  Carroll  and  Levine  (1979),  Cliff  (1973),  Green 
and  Carmone  (1970),  Green  and  Rao  (1972),  Green  and  Carroll  (1976)  and 
Kruskal  (1977). 

There  are  quite  a  number  of  publicly  available  computer  programs  for 
MDS,  as  discussed  in  detail  in  Kruskal  (1977).   Fortunately,  for  the  appli- 
cation in  this  paper,  the  so-called  "classical"  method  of  MDS  is  entirely 
adequate,  and  this  can  be  programmed  quite  easily  and  briefly,  as  described 
below.  For the applications in this paper, however, we made use of the
program most conveniently available to us, KYST-2 (Kruskal, Young, and
Seery (197?)).

The second part of the MDS-D method is to use the matrix D of
graph-theoretic distances as input to MDS, and let it calculate the
configuration of points in two-dimensional space whose Euclidean distances best
match  the  given  graph-theoretic  ones.   (For  a  discussion  of  what  "best" 
means,  and  the  many  different  interpretations  available,  see  the 
literature  referred  to  above.) 

It  should  clearly  be  understood  that  we  are  free  to  rotate  and  turn 
over (i.e., mirror-image) the configuration provided by MDS.  (Since MDS
depends  only  on  interpoint  Euclidean  distances,  which  are  not  affected  by 
rotation,  the  quality  of  matching  is  not  affected  by  rotation.) 
Generally,  one  or  two  positions  will  be  found  which  are  most  pleasing. 

All  that  MDS  and  the  method  of  this  paper  provide  are  locations 
for  the  nodes.   All  the  remaining  steps,  such  as  choosing  size  or  shape 
of  nodes,  paths  for  the  edges,  etc.,  remain  to  be  done  by  conventional 
methods. 

In our computer program, we provided the lower half of D (without
the diagonal) to the MDS program KYST-2, and accepted the default
choices for the many options in that program except as follows:

    LOWERHALFMATRIX, DIAGONAL = ABSENT
    REGRESSION = POLYNOMIAL = 1, REGRESSION = NOCONSTANT
    ITERATIONS = 125
    STRMIN = 0.0, SRATST = 1.0

The first line above controls the form of the data.  The second line
controls the type of regression to be performed, i.e., linear regression
with zero constant term.  The third line gives the maximum number of
iterations; if not specified, only 50 would be permitted.  The fourth
line determines a definition of convergence which is stricter than the
default case.

We also experimented with defining the distance $d_{ij}$ between two
nodes to be either one (if they are adjacent) or "infinity" (if they
are not directly connected), and using the REGRESSION=ASCENDING option
to perform an ascending monotonic regression.  However, we found the
method described above to give more desirable diagrams.

Finally,  for  those  who  may  want  to  write  their  own  program,  we 
describe  very  briefly  the  simplest  method  to  program,  namely  the 
classical  method  of  MDS,  which  was  rescued  from  obscurity  by  Torgerson 
(1958,  Chapter  11). 

To  perform  classical  scaling  in  this  context,  some  preliminary 
steps  which  are  needed  in  other  contexts  may  be  skipped,  because  the 
input  matrix  D  has  appropriate  form.   (These  steps  consist  of  a  pre- 
liminary transformation  of  the  input  values  to  make  them  "distance-like", 
and/or  finding  an  appropriate  constant  and  adding  it  to  every  value.)   In 
this context, the first step is to square each value of D.  We call the
resulting matrix $D_2$.  Next find the grand mean of $D_2$, i.e., the mean
of all $n^2$ entries, and subtract it from each one of the entries to form
$D_3$.  Then find the average of every row of $D_3$: call the average of
the i-th row $\bar{d}_i$, and note that $\bar{d}_i$ is also the average of
the i-th column since $D_3$ is symmetric.  Next, subtract
$\bar{d}_i + \bar{d}_j$ from the (i,j) entry of $D_3$ and similarly for
every entry, and call the resulting matrix $D_4$.  Finally, let
$B = -(1/2) D_4$.
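
The centering steps above translate almost line for line into code. The
following is a sketch under our own naming (`double_center` is not a name
from the paper), written in plain Python so that each intermediate matrix
stays visible.

```python
def double_center(D):
    """Turn a symmetric distance matrix D into the matrix B of the text."""
    n = len(D)
    # Step 1: square every entry.
    D2 = [[d * d for d in row] for row in D]
    # Step 2: subtract the grand mean of all n*n entries.
    grand = sum(sum(row) for row in D2) / (n * n)
    D3 = [[v - grand for v in row] for row in D2]
    # Step 3: row averages of D3 (equal to column averages by symmetry).
    row_mean = [sum(row) / n for row in D3]
    # Step 4: subtract the row average and the column average from each entry.
    D4 = [[D3[i][j] - row_mean[i] - row_mean[j] for j in range(n)]
          for i in range(n)]
    # Step 5: scale by -1/2.
    return [[-0.5 * v for v in row] for row in D4]
```

For three points on a line at -1, 0 and 1, the result is the matrix of
products of those coordinates, which is what the theorem discussed next
leads one to expect.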

Before  proceeding  with  the  next  steps,  we  explain  the  theorem  of 
Young  and  Householder  (1938)  which  is  the  mathematical  basis  for  clas- 
sical MDS.   Suppose  there  are  n  points  in  Euclidean  space  (e.g.,  the 
plane or three-dimensional space).  Suppose these points are centered at
the origin, i.e., $\sum_i x_i / n = 0$.  Suppose D is the matrix of
distances between these points, i.e., the (i,j) entry of D is the distance
between $x_i$ and $x_j$.  Then according to the theorem, the entries of B
are the inner products of the $x_i$, i.e., $b_{ij} = x_i \cdot x_j$ for
every i and j.  Stating the same fact in matrix form, $B = XX^T$ where
$x_i$ is the i-th row of X and $X^T$ means the transpose of X.

Thus, we wish to find X such that $B = XX^T$, i.e., to factor B.  The
first step in factoring is to find the eigenvectors and eigenvalues of B.
This can be done by using any standard eigenvector routine.  (Note that
many such routines only work for symmetric matrices.  Fortunately, B is
symmetric.)  Specifically, let E be the matrix whose columns are the
eigenvectors and let $\Lambda$ be the diagonal matrix of eigenvalues.  Most
eigenvector routines yield E (or $E^T$) and $\Lambda$.  By definition of
eigenvectors, $BE = E\Lambda$.  Multiplying by $E^T$ on the right, and
using the fact that $EE^T$ = the identity matrix, we find the well-known
result that $B = E \Lambda E^T$.

If the eigenvalues are all positive, we can form $\Lambda^{1/2}$, and then
$B = (E \Lambda^{1/2})(E \Lambda^{1/2})^T$, so $X = E \Lambda^{1/2}$ would
appear to provide the solution we want.  Unfortunately, there are two
problems.  First, the input matrix D cannot be expected to satisfy the
hypothesis of the theorem (i.e., it consists of graph-theoretic distances,
not Euclidean distances in the plane), so that some of the eigenvalues may
be negative.  Second, in this use of classical scaling, we want the rows of
X to give us the coordinates of points in the plane.  This means we want X
to have only two columns, so each row has only two coordinates.
Fortunately, we can dispose of both problems with one solution.

In accordance with standard custom, we assume that the diagonal
entries of $\Lambda$ are arranged in decreasing order of size.  Thus the
largest eigenvalues come first, and the negative eigenvalues (if any)
come last.  (In practice, for the type of application described in this
paper there will be at most a few negative entries, and these will
have small magnitude.)  The solution is to replace $\Lambda^{1/2}$ by
$\Lambda_2^{1/2}$, which consists only of the first two columns of
$\Lambda^{1/2}$.  To form $\Lambda_2^{1/2}$ requires only that the first
two eigenvalues be positive, since the remaining eigenvalues are discarded
anyway.  Then we form $X = E \Lambda_2^{1/2}$, which has only two columns,
and the job is finished.
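
For readers writing their own program, the final step might be sketched as
below. Note the substitution: in place of "any standard eigenvector routine"
this example uses power iteration with deflation, chosen only so that the
sketch needs nothing beyond the Python standard library. It assumes, as the
text suggests for this application, that the two largest eigenvalues are
positive and dominate any negative ones in magnitude. The function names,
iteration count and seed are our own.

```python
import math
import random


def top_eigenpairs(B, count=2, iters=500, seed=0):
    """Approximate the `count` dominant eigenpairs of a symmetric matrix B."""
    rng = random.Random(seed)
    n = len(B)
    M = [row[:] for row in B]
    pairs = []
    for _ in range(count):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate for v.
        lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found before the next search.
        for i in range(n):
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]
    return pairs


def classical_coordinates(B):
    """Return two coordinates per node: eigenvector entries times sqrt(eigenvalue)."""
    pairs = top_eigenpairs(B, count=2)
    if any(lam <= 0.0 for lam, _ in pairs):
        raise ValueError("the first two eigenvalues must be positive")
    n = len(B)
    return [[math.sqrt(lam) * vec[i] for lam, vec in pairs] for i in range(n)]
```

The coordinates are determined only up to rotation and reflection, so a
check of the result should compare interpoint distances rather than the
coordinates themselves.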

IV.       SOME   PRACTICAL  APPLICATIONS 

For  our  first  application  we  choose  a  good  diagram  from  Lipinsky 
(1978)  which  shows  the  growth  and  dominant  uses  of  corn  in  a  projected 
version  of  the  U.S.  agricultural-industrial  system. 


See  Figures   2A  and  2B  on  following  pages 


Figure  2A  shows  the  diagram  created  by  our  procedure,  using  input  data 
derived  from  Figure  2B,  and  Figure  2B  shows  the  original  figure 
(specifically,  their  Figure  4,  labelled  "Hypothetical  corn-to-ethanol 
system,  integrated  with  conventional  utilization  of  corn").   For  the 
purpose  of  comparison,  we  have  drawn  our  diagram  in  as  similar  a 
fashion as possible to theirs.  The symbol ▷◁ indicates a valve
controlling allocation of product flow among destinations.  Thus, a
larger  amount  of  corn  stover  can  be  diverted  from  the  field  to  feed 
cattle  when  desired;  this  variability  plays  an  important  role  in 
Lipinsky (1978).  (Note that corn stover refers to the entire plant
except the kernels, i.e., the cob, stalk, husks, and leaves.  Stillage
is the mash from an alcoholic fermentation in a still after removal of
the alcohol.)

FIG. 2A  Network diagram, based on Lipinsky (1978),
designed with the aid of MDS-D.

FIG. 2B  Network diagram, as it appears in Lipinsky (1978)

At  first  glance,  one's  impression  is  that  Figure  2B  is  considerably 
neater  than  Figure  2A.   In  our  opinion,  this  is  basically  an  esthetic 
reaction,  which  reflects  the  arrangement  of  boxes  on  a  few  horizontal 
lines.   For  a  thorough  study  of  this  network,  however,  it  is  not 
obvious  to  us  which  diagram  is  superior.   It  appears  to  us  that  psycho- 
logical experiments  would  be  necessary  to  reach  such  a  decision.   It  is 
a  pleasant  surprise  that  the  locations  provided  by  a  simple  automatic 
procedure  should  do  reasonably  well.   Note  that  although  the  two  diagrams 
resemble  mirror-images  of  one  another  to  some  extent  (using  a  vertical 
mirror),  they  differ  substantially. 

Quite  apart  from  the  merits  of  the  diagrams,  there  is  the  question 
of  comparing  the  effort  which  is  required  to  design  diagrams  by  the  two 
procedures.   The  work  involved  in  designing  a  diagram  in  the  conventional 
way  has  seldom  been  recorded  and  described.   (Sometimes  the  design 
process  is  intertwined  with  the  logically  prior  step  of  deciding  what 
nodes  and  which  edges  to  include.)   Nevertheless,  it  may  be  worthwhile 
for  us  to  use  this  application  to  document  more  precisely  the  steps 
involved  in  our  procedure. 

First  the  nodes  were  numbered  (from  1  to  16,  in  an  arbitrary 
order),  and  the  list  of  edges  to  and  from  each  node  was  prepared  in  a 
form  similar  to  that  shown  in  Figure  3. 


See  Figure   3  on  following  page 


This input was fed into our computer program which computed the matrix
D of graph-theoretic distances as described above.  This matrix was
automatically passed on to the MDS program KYST-2A along with suitable
instructions (as indicated above).  KYST displays locations for the nodes
in two forms:  a printer-plot and numerical coordinates.  In this case,
tracing paper was used to rotate the printer plot to a suitable
orientation.  Boxes were drawn, centered approximately at the appropriate
nodes, the boxes were labelled, and arrows drawn.  This rough plot was
given to the drafting department, which produced Figure 2A.

VERTEX   EDGES TO & FROM

   1     k
   2     i|
   3      6   7
   4      7  12
   5      6   8   9  11
   6      3  13  10   5
   7      3   4  10
   8     10   5
   9     10   5
  10     14   9   8   7
  11      5  14
  12      4  14
  13     15   6  16
  14     15  10  11  12
  15     13  14  16
  16     15  13

The vertex in the left column is connected to

FIG. 3.  Input for Corn Biomass Network

When  drawing  the  boxes,  all  but  four  of  them  were  placed  with  their 
centers  directly  on  the  positions  provided  by  MDS-D.   Two  of  them  were  displaced 
very  slightly  for  esthetic  reasons,  and  two  of  them  were  displaced  some- 
what more:   "Corn  Field"  was  raised  (by  roughly  1.5  box  heights)  and 
"Export"  was  lowered  by  a  similar  amount.   No  doubt  the  diagram  could  be 
improved  still  further  by  other  changes.   We  consider  any  degree  of  change 
entirely  appropriate,  as  indicated  earlier,  even  total  abandonment  in 
favor  of  an  alternative. 

Another  application  is  based  on  Sain  and  Uhran  (1975),  in  which  the 
flow  of  criminal  cases  through  the  judicial  system  of  Saint  Joseph  County 
is  described. 


See  Figures    4A   and   4B  on   following  pages 

Figure  4A  shows  our  figure,  and  Figure  4B  shows  the  original  figure 
(specifically,  their  Figure  5,  labelled  "Saint  Joseph  County,  direct 
arrest").   Again,  for  the  purposes  of  comparison  we  have  drawn  our  diagram 
in  as  similar  a  fashion  as  possible  to  theirs. 

In  this  case,  slightly  greater  liberties  have  been  taken  with  the 
locations  provided  by  MDS-D.   The  MDS-D  output  was  rotated  and  reflected 
(as in a mirror).  We also reduced somewhat the size of the upper loop and
expanded somewhat the two lower wings.  Also, the mirror symmetry between
the two wings was improved somewhat at one stage, and then unfortunately
degraded somewhat at a later stage due to space limitations.

FIG. 4A  Network diagram, based on Sain and Uhran (1975),
designed with the aid of MDS-D.

FIG. 4B  Network diagram, as it appears in Sain and
Uhran (1975).

A  further  application  is  based  on  Moore  and  Taylor  (1977),  in  which 
the  organization  of  a  complex  research  and  development  process  is 
discussed. 


See Figures 5A and 5B on following pages

Figure 5A shows our figure, and Figure 5B shows the original
(specifically,  their  Figure  1,  labelled  "GERT  Network  of  Multiteam, 
Multiproject  R&D  Process").   In  this  case  we  have  taken  the  liberty 
of  using  their  edges    (rather  than  their  nodes)  as  our  nodes,  and  have  not 
kept  to  the  spirit  of  the  original  figure  in  several  ways. 

V.      A  MATHEMATICAL  APPLICATION 

The  earliest  use  of  MDS  as  an  aid  to  drawing  networks,  so  far  as  we 
know,  was  by  Dr.  Shen  Lin  (oral  communication)  in  the  context  of  pure 
mathematics.   Although  his  primary  interest  was  in  certain  graph- 
theoretic  questions  (Given  two  graphs,  are  they  isomorphic?   if  so, 
what  is  an  isomorphism?),  he  used  MDS  somewhat  as  we  have  to  pursue  these 
questions.   He  also  noted  the  potential  value  of  MDS  simply  for  drawing 
complicated  networks. 

Using  classical  MDS  (as  described  above,  though  not  necessarily  in 
only  two  dimensions),  he  applied  the  method  we  have  described  to  the 
graphs  shown  in  Figure  6,  which  look  quite  similar  (and  which  are  not 
easily  proved  nonisomorphic  by  classical  graph-theoretic  means). 

FIG. 5A  Network diagram, based on Moore and Taylor
(1977), designed with the aid of MDS-D.

FIG. 5B  Network diagram, as it appears in
Moore and Taylor (1977).

See Figures 6 and 7 on following pages

Figure 6A yielded the diagram shown in Figure 7, which is a strikingly
different and perhaps clearer way of drawing the same graph.  On the other
hand, the graph of Figure 6B yielded five groups of points (namely 1, 6,
11, 16; 2, 7, 12, 17; etc.), where the points in each group fall exactly
in the same position and the five positions lie on a circle.  Of course
this shows that the two graphs of Figure 6 are not isomorphic.

Furthermore,  when  he  applied  classical  scaling  to  a  graph  isomorphic 
to  Figure  6A  (obtained  by  scrambling  the  point  labels  on  this  graph), 
Figure  7  was  again  obtained,  but  with  the  labels  suitably  scrambled. 
From  these  two  figures  it  is  easy  to  read  off  an  isomorphism  between 
the two graphs.

In  the  case  of  Figure  6B,  the  matter  is  not  quite  as  simple. 
Although  a  graph  isomorphic  to  this  one  yields  the  same  figure  as  before, 
with  the  labels  suitably  permuted,  it  is  not  possible  to  read  off  an 
isomorphism  from  the  two  figures  because  several  labels  correspond  to 
each  configuration  point.   However,  by  using  classical  MDS  in  three 
dimensions,  this  difficulty  is  resolved.   Furthermore,  it  is  interesting 
to  note  that  Lin  found  the  classical  MDS  solution  in  four   dimensions  to 
be  just  the  Cartesian  product  of  five  points  in  a  circle  (the  first  two 
dimensions)  times  four  points  in  a  circle  (the  last  two  dimensions). 

VI.       CONCLUSION  AND   SUGGESTIONS 

We  reach  the  following  conclusions. 

(1)  The application of multidimensional scaling to the graph-
theoretic distances in a network can be helpful in designing a
network diagram, by providing locations for the nodes of the
network.

(2)  Once  a  user  is  tooled  up  to  apply  this  method,  it  may  sub- 
stantially reduce  the  effort  needed  to  design  a  good  network 
diagram. 

(3)  The locations provided by this method are often quite
reasonable, though users should feel free to modify them as much as
desired.

(4)  This  method  is  primarily  useful  as  a  rapid,  easy  way  of  provid- 
ing a  reasonable  set  of  locations  for  the  nodes.   We  certainly 
do  NOT  claim  optimality  of  any  kind  for  this  method. 

(5)  We  believe  that  the  method  may  be  most  helpful  for  networks 
which  are  highly  multiply  connected,  and  may  have  less  value 
in  the  converse  case,  e.g.,  for  a  tree. 

For  use  of  this  method,  we  make  the  following  recommendations: 

(a)  Apply  a  suitable  variety  of  metric,    as  opposed  to  nonmetric, 
multidimensional  scaling.   The  use  of  the  MDS  programs  KYST- 
2A,  KYST,  or  M-D-SCAL  5  with  the  options  shown  above  is  one 
suitable  approach.   The  use  of  classical  MDS  is  another. 

(b)  In  practice,  diagrams  often  have  several  kinds  of  nodes  (such 
as  the  switch  nodes  in  Figure  2,  and  nodes  like  the  small 
round  circles  in  Figure  3  which  serve  only  as  juncture  points 
to  reduce  the  number  of  direct  edges  between  the  major  nodes). 
Our  informal  experimentation  suggests  that  better  results  are 
obtained  by  omitting  such  minor  nodes  when  applying  our  method, 
and  simply  placing  them  later  by  hand. 
(c)  It may be desirable to omit some major nodes when applying
our  method,  and  to  place  them  later  by  hand.   Notably,  when 
a  string  or  a  tree  hangs  from  the  multiply  connected  portion 
of  the  graph,  it  may  be  more  convenient  to  omit  the  nodes  in 
the  string  or  the  tree,  and  this  omission  should  yield 
results  as  good  as  those  obtained  by  not  omitting  them. 

DEVELOPMENTS IN STATISTICAL GRAPHICS 1960-1980

Alan   Julian   Izenman 

Colorado    State  University 

ABSTRACT 

This   paper   presents   an   historical   perspective  on  the  development    of 
graphical  methods   of    data   analysis  during    the   twenty   years   from   1960   to 
1980.      Three  main  periods   of   development   are    identified:      pre-1960;    the 
1960's; and the 1970's.  Discussion of various types of univariate and
multivariate data graphics is given, including methods for probability
plotting.  Possible future directions for statistical graphics are also
mentioned.

Key   words:      probability  plots,    Q-Q   plots,    semigraphical   displays, 
scatter-plots,    residual   analysis,    outliers,   multivariate  data   analysis, 
computer   graphics,    color-coded   graphics,    color    statistical  maps,    history 
of    statistics. 

I.  INTRODUCTION

In  recent   years   there   has   been  a   noticeable  rise  of   concern  among  mem- 
bers  of   the   statistical   profession  about    the   nature  of   and    standards   for 
statistical   graphics,    with   special    sessions    (1976   Boston  ASA  Meeting, 
1978    San   Diego   ASA  Meeting)    and   conferences    (1977    Sheffield    Conference, 
1978    1st    Social   Graphics    Conference    in   Leesburg,    Virginia)    on   the   subject 
now  taking   place  on  a   regular    basis.      A  result    of   all    this   concern  has 
been an improvement in the formulation (Tufte (1978), Cox (1978),
Fienberg (1979)) and evaluation (Wainer and Reiser (1976), Wainer (1977))
of rules for drawing graphs.

A  great   deal   of    the   credit    for   this   "movement    to   better   graphics"    is 
due  to    the  widespread   availability   of    electronic   high-speed   computers   and 
their associated graphics systems.  With computer hardware becoming cheaper
every year, the type of exclusivity in this field that once belonged to
research establishments such as Bell Laboratories appears to be fast
disappearing.  Furthermore, a number of software computer packages have been
specifically created for the purpose of graphing data or their summaries
(Hoaglin and Velleman (1978), Hoaglin and Wasserman (1975)).

Most   published   research    in  the  area   of    statistical   graphics   has    so    far 
been   concentrated   on  providing  various   types   of    two-dimensional   displays. 
Some work has been done on three-dimensional graphics, but data in higher
numbers of dimensions are impossible to visualize directly and therefore
need special tools for specific types of analysis. Recently, several major
advances have been made in this direction. The two most prominent of these
are the Fisherkeller, Friedman and Tukey (1974) interactive data display and
analysis system called PRIM-9, and the use of color-coded graphics to enhance
visual comprehension of large data sets and statistical maps (Fienberg
(1979), Wainer and Francolini (1978)).

The  purpose  of    this   article    is   to   trace   the  development   of    statistical 
graphics   during   the   last   two   decades.      As   should    soon   become   apparent,    a 
great many of the contributions to the subject originated with the
statistical research groups at Bell (Telephone) Laboratories, and their work,
which started appearing in the 1960's, still continues to have a very
important catalytic effect on the general usage of data graphics today.

In   Section   II   of    this   paper,    we  give  a    brief    survey   of    the   state  of 
statistical   graphics   prior    to    1960.      The  next   decade,    1960-1969,    saw   the 
emergence  of   Bell   Laboratories  as   a  major   center    for   data   analysis   and 
statistical   graphics,    and    its  main   features   are  described    in   Section   III. 
The   sudden  diversity   of   contributions   to    the   subject   during   the  next    ten 
years, 1970-1979, is characterized in Section IV by a general desire to work
towards a theory or philosophy of statistical graphics. Finally, in Section
V, possible directions and future developments of the subject are briefly
discussed.

II.       STATISTICAL   GRAPHICS   PRIOR    TO    1960 
A  number   of   papers   have  appeared   recently   describing   the   origins   and 
subsequent   development   of    "the   graphical   method   of   representing   data."      We 
shall   not   repeat    these  details   here;    the    interested   reader   may    instead   con- 
sult   the  historical   accounts   by   Royston    (1970),    Beniger   and   Robyn    (1978), 
and    Cox    (1978)    for    such    information,    and   also   the  useful   bibliography   by 
Feinberg   and   Franklin    (1975). 

A  most    important    contribution   to    the   subject,    however,    does   not    seem   to 
have   been  given   the   historical    significance    it   probably  deserves.      C. 
Daniel's (1959) paper on half-normal probability plotting and its application
to the analysis of $2^k$ factorial experiments (k factors each at two
levels) and their fractions had a profound influence on the course of
statistical graphics throughout the 1960's.

Daniel's   basic    idea   was   a   variant    on   the    then  well-established   practice  of 
full-normal   probability   plotting    (Chernoff   and   Lieberman    (1954,    1956)).   At- 
tention,   however,    was   focussed    on  the  absolute  values   of    the  treatment   con- 
trasts.     Under   a   null   hypothesis   of   no   treatment    effects,    and   under   a 
Gaussian   error  model,    these  absolute  contrasts    should    be    independently  and 
identically   half-normally   distributed.      If    the  null   hypothesis   holds,    a 
plot   of    the  ordered   absolute  contrasts   against    the   quantiles   of    the   half- 
normal   distribution   should    yield   a    straight-line  configuration   through   the 
origin.      If,    on   the  other    hand,    the  plot    shows   any   of   a  variety   of   pecu- 
liarities   (such  as  acute  non-linearity,    lack  of    near-zero   values,    gaps    in 
the plot, overly-large values, etc.), then doubt may be thrown onto the
validity of the model and of the null hypothesis. Daniel suggested plotting
the null absolute contrasts (i.e., those which did not appear to correspond
to real treatment effects) on another half-normal plot. From this
second plot, a reasonable graphical estimate of the experimental error
variance may be obtained. In this way, protection against possible biases
at the estimation stage is more-or-less guaranteed.
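
The construction of a half-normal plot can be sketched in a few lines of
Python. This is only an illustrative sketch under stated assumptions, not
Daniel's own procedure: the function names, the plotting positions
(i - 0.5)/n, and the made-up responses for a $2^3$ experiment are all
introduced here for illustration.

```python
from statistics import NormalDist


def yates_contrasts(y):
    """Yates' algorithm for a 2^k factorial with responses in standard order.

    Returns the 2^k - 1 treatment contrasts (the grand total is dropped).
    """
    n = len(y)
    k = n.bit_length() - 1
    if n < 2 or 2 ** k != n:
        raise ValueError("need 2^k responses in standard order")
    col = list(y)
    for _ in range(k):
        # Each pass: sums of adjacent pairs, then differences of adjacent pairs.
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return col[1:]


def half_normal_plot_points(contrasts):
    """Pair the ordered absolute contrasts with half-normal quantiles."""
    n = len(contrasts)
    ordered = sorted(abs(c) for c in contrasts)
    std_normal = NormalDist()
    quantiles = []
    for i in range(1, n + 1):
        p = (i - 0.5) / n  # assumed plotting position
        # The half-normal quantile at p is the normal quantile at (1 + p) / 2.
        quantiles.append(std_normal.inv_cdf((1 + p) / 2))
    return list(zip(quantiles, ordered))


# Hypothetical responses in standard order: (1), a, b, ab, c, ac, bc, abc.
responses = [60.0, 72.0, 54.0, 68.0, 52.0, 83.0, 45.0, 80.0]
points = half_normal_plot_points(yates_contrasts(responses))
```

Plotting the second coordinate of each pair against the first gives the
half-normal plot; contrasts lying well above a line through the origin
fitted to the smaller values would be the candidates for real effects.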

Although Daniel realized some time later that half-normal plots did
not   perform  well    in  certain   situations,    it    was   the  appearance  of    this 
paper   which  really  motivated   other   researchers   to   think   seriously  about 
graphical  methods   and    to   develop   them  to    the   state   that    they   exist    today. 

III.       THE   SIXTIES 

Data   analysis  and    statistical   graphics   suddenly   came  alive  during   these 
ten   years.      J.    W.    Tukey's  memorable  paper    (Tukey    (1962)),    in  which  he  dis- 
cusses  his    ideas   on   future  directions   of    the   subject,    set   the   tone   for   the 
next    two   decades. 

At    the   beginning   of    the   1960's,    a    statistical   research  group  was   or- 
ganized  at   Bell   Laboratories,    headed    by  M.    B.    Wilk  and   R.    Gnanadesikan, 
to   develop   data-based    statistical   tools   following    the   principles   laid   down 
by   Daniel   and   Tukey. 

In  a    series   of   articles,    Wilk  and   Gnanadesikan  carried   out   a    system- 
atic   study   of   probability   plotting   procedures,    with   special    emphasis   on 
the   gamma   and   beta   distributions.      These   latter    two   distributions,    unlike 
the full- and half-normal distributions, cannot be standardized through
linear transformations to eliminate dependence on their respective
parameters. As a result, it is not possible to prepare general all-purpose
gamma or beta probability paper. Wilk and Gnanadesikan, however, devised
two-stage estimation procedures from which either gamma or beta plots
could   be   produced.      Their    ideas   carry   through    in  a    similar  manner    to   other 
types   of   distributions   for   which  probability   paper    cannot    be  constructed. 

Their initial interest in gamma probability plots derived from the desire
to develop a workable generalization to the multiresponse case of
Daniel's graphical analysis of uniresponse factorial two-level experiments.
Each element in a single degree-of-freedom contrast vector can be obtained
through a separate application of Yates' algorithm to each variable
response. A common compounding matrix is then used to construct a quadratic
form from each of those $n = 2^k$ contrast vectors. Under suitable Gaussian
assumptions, these quadratic forms should each be distributed as weighted
sums of mutually independent single degree-of-freedom chi-squared variates
(typically, with unknown weights); this distribution, fortunately, is known
to be reasonably well approximated by a gamma distribution (with unknown
shape and scale parameters).

Wilk, Gnanadesikan and Huyett (1962a, 1962b) proceeded, first, to develop
maximum-likelihood estimates of the parameters of a gamma distribution
fitted to a nominated number, M, of the K smallest quadratic forms
(M < K < n), and then to compute the appropriate quantiles of that fitted
gamma distribution against which the complete set of n ordered quadratic
forms could be plotted. Gamma plots were, therefore, designed to (1)
identify the extra large values (associated with real multidimensional
treatment effects), (2) provide parameter estimates that are not inflated
by the presence of those real effects, and (3) check distributional
assumptions through linearity of the plot. An excellent description of the
use of gamma plots for two-level multiresponse experiments is given in
Wilk and Gnanadesikan (1964). A computer program, written by E. Fowlkes
at Bell Laboratories, to estimate the shape and scale parameters of a
fitted gamma distribution and also to determine the gamma quantiles, appears
as an Appendix to the book by Roy, Gnanadesikan and Srivastava (1971).

Following the determination of (single degree-of-freedom) gamma plots,
attention of the group at Bell Laboratories then turned towards the beta
distribution and beta probability plots. Special cases of the beta
distribution include the t, F, binomial, and negative-binomial distributions.
In the paper by Gnanadesikan, Pinkham and Hughes (1967), maximum likelihood
estimation of the two (shape) parameters of the beta distribution and
construction of beta plots were discussed, based on order-statistic
arguments similar to those for the gamma distribution.

It    is    interesting    to   note  that   most   of    the  applications    in   the  above 
papers   for   gamma   and   beta   plots   centered   around   problems    involving    talker 
identification.      Experiments   on  talker    identification  at   Bell   Laboratories 
during   this  decade  provided    the   research  group  with  a    singularly  rich 
source   of   data   on  which  to   try   out   new   statistical   techniques.      See,    in 
particular, the work of Becker, et al (1965).

A  general   overview  of    the   state  of    the  art   of    probability   plotting   pro- 
cedures  appeared    in  Wilk  and    Gnanadesikan    (1968).      Probability   plots   were 
characterized    in   that   paper   as    special   cases   of    the  more  general   Q-Q   plots, 
in which selected quantiles, $Q_1(p) = F_1^{-1}(p)$, of a reference cumulative
distribution function (cdf) $F_1$, are plotted against the corresponding
quantiles, $Q_2(p) = F_2^{-1}(p)$, of a comparison cdf $F_2$, for $0 < p < 1$.
The two distributions involved in the plot could be either theoretical or
empirical. Thus, the empirical cdf of one sample can be compared to the
empirical cdf of a second sample; if the underlying distributions of the two
samples are identical, then the plot should yield a straight-line
configuration. Acute departures from a straight line might indicate
distributional differences in location, shape and other characteristics,
such as tail behavior. Plotting the empirical cdf of a single sample against
the theoretical quantiles
of  a  particular  distribution  corresponds  to  the  usual  probability  plots. 
Investigations  of  convergence  of  one  theoretical  cdf  to  another  can  also  be 
made  through  a  Q-Q  plot,  in  which  the  two  sets  of  theoretical  quantiles  are 
plotted   against    each  other. 
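
A two-sample empirical Q-Q plot can be sketched as follows. This is a
minimal illustration rather than the procedure of any of the papers cited:
the linear-interpolation rule for empirical quantiles, the function names,
and the two small samples are assumptions made here.

```python
def empirical_quantile(sorted_sample, p):
    """Linearly interpolated empirical quantile for 0 <= p <= 1."""
    pos = p * (len(sorted_sample) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_sample) - 1)
    frac = pos - lo
    return sorted_sample[lo] * (1 - frac) + sorted_sample[hi] * frac


def qq_points(sample_1, sample_2, n_points=19):
    """Pairs (Q_1(p), Q_2(p)) on an even grid of probabilities p."""
    s1, s2 = sorted(sample_1), sorted(sample_2)
    probs = [(i + 1) / (n_points + 1) for i in range(n_points)]
    return [(empirical_quantile(s1, p), empirical_quantile(s2, p)) for p in probs]


# Hypothetical data: the second sample is a shifted and rescaled copy of the
# first, so the two differ only in location and scale.
first = [2.1, 3.4, 1.9, 5.6, 4.4, 3.0, 2.7, 6.1, 3.8, 4.9]
second = [2 * x + 3 for x in first]
pairs = qq_points(first, second)
```

In this constructed case every pair falls on the line y = 2x + 3; with real
samples, curvature in the plotted pairs is what would suggest a difference
in distributional shape.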

A   study  of    the   possible   shapes   of   Q-Q   plots   when   the  reference  distrib- 
ution  is    standard   Gaussian  was   carried   out    by   Blake  and   Fowlkes    (1966). 
The   shapes   of    these  plots   depend   on  whether    the  comparison  distribution 
function    is    symmetric     or     asymmetric.      If    the   comparison  distribution 
function    is  unimodal   and    symmetric    (about    zero,    say),    then   the   Q-Q   plot 
will   be   S-shaped  with   the   two    semi-circular    lobes   of    the   S    symmetric   about 
a   point   of    inflection   at    the  origin.      The   shape   of    these   lobes   depends   on 
whether    the  comparison  distribution   function   has    shorter   or    longer   tails 
than   the   standard   Gaussian.       If    the  comparison  distribution   is  unimodal 
but   asymmetric,    then   the   Q-Q   plot   will   be   strictly   convex    (or   concave)    with 
respect    to    the   horizontal   axis.      One   possible  use  of    such  a    study  would    be 
to   provide   the   reader   with  a   "dictionary"   of   alternatives   to   the   standard 
Gaussian   in   the   event   that,    say,    a    full-normal   probability   plot   failed   to 
deliver   a    straight    line.      Similar    studies   were  also  made   for   the  cases   of 
uniform  and    exponential   reference  distribution   functions. 
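
The two kinds of shape described above can be checked numerically. The
sketch below is not taken from Blake and Fowlkes (1966); it assumes a
standard Laplace distribution as an example of a symmetric, longer-tailed
comparison distribution and a unit exponential as an asymmetric one, and it
places the Gaussian quantiles on the horizontal axis. All function names
are hypothetical.

```python
import math
from statistics import NormalDist


def laplace_quantile(p):
    """Quantile of the standard Laplace (double-exponential) distribution."""
    return math.log(2 * p) if p < 0.5 else -math.log(2 * (1 - p))


def exponential_quantile(p):
    """Quantile of the unit exponential distribution."""
    return -math.log(1 - p)


def theoretical_qq(comparison_quantile, n_points=99):
    """Points (Gaussian quantile, comparison quantile) on an even grid of p."""
    normal = NormalDist()
    probs = [i / (n_points + 1) for i in range(1, n_points + 1)]
    return [(normal.inv_cdf(p), comparison_quantile(p)) for p in probs]


def segment_slopes(points):
    """Slopes of the line segments joining consecutive plotted points."""
    return [(y1 - y0) / (x1 - x0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


laplace_slopes = segment_slopes(theoretical_qq(laplace_quantile))
exponential_slopes = segment_slopes(theoretical_qq(exponential_quantile))
```

For the exponential comparison the slopes increase steadily, which
corresponds to a convex plot. For the Laplace comparison the slopes fall to
the left of the origin and rise to the right of it, which corresponds to an
S-shaped plot with its point of inflection at the origin.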

Research   into   residual   analysis,    both  from   a  multiple   regression  and 
from a two-way table fit, was discussed at length by Anscombe (1960),
Tukey (1962), and Anscombe and Tukey (1963), with emphasis on outlier
detection. Graphical methods suggested in those papers included various
types of residual plots: (i) plotting residuals against various independent
variables; (ii) plotting residuals against fitted values; and (iii)
full-normal probability plots of ordered residuals (abbreviated to FUNOP).
These plots provided visual checks on possible non-normality, non-additivity,
and heterogeneity of error variances, and also allowed any aberrant values to
be readily identified.

Some  additional   papers   discussed    in  detail   the  use   of   graphics    in   the 
analysis   of   multivariate  data.      Gnanadesikan  and  Wilk    (1969)    presented   a 
review  of   different   aspects   of    scatterplotting  multivariate  data,    as   well 
as   further    illustration  of    the  use   of    Q-Q   plots.      The   basic   gamma   plotting 
idea was also extended by Gnanadesikan and Lee (1970) to equal (greater
than two) degree-of-freedom situations; these included (i) (simultaneous)
comparisons of all main effects, or all interactions of the same order, in
a multiresponse factorial experiment with all factors at the same number
of    levels,    and    (ii)    comparison   of   within-group   covariance  matrices    in  a 
discriminant   analysis    setting.      The   statistics   proposed   for    these  plotting 
purposes   were   either    the  arithmetic  mean   or   the  geometric   mean   of    the  non- 
zero   eigenvalues    (i.e.,    measures   of    size)    of    the  various   covariance  matri- 
ces;   under    suitable  Gaussian  assumptions,    the  distributions   of    these   sta- 
tistics  are  well   approximated    by   gamma   distributions,    and   hence,    can   be 
gamma   plotted.      A  further   paper,    by   Gnanadesikan  and  Wilk    (1970),    described 
the   contribution   of   generalized   probability   plots   to   analysis   of   variance 
situations    in  which   the  mean   squares    have  differing   degrees   of    freedom. 
(Their    complicated  method   of    calculating    the  conditional    expected  values 
of   order    statistics   was,    however,    simplified   a   great   deal   by  Hastings 
(1970).) 

Graphical displays also played a part in multidimensional scaling
techniques, developed by L. Guttman and at Bell Laboratories by J.B. Kruskal,
R.N. Shepard, and J.D. Carroll. Various types of two-dimensional plots
were used by them to obtain estimates of dimensionality in the reconstruction
of "maps" in Euclidean space, given the matrix of interpoint distances.
See Kruskal (1964a, 1964b), Shepard and Carroll (1966), and Shepard (1974),
and the references therein. Since its appearance, this field has been
growing tremendously as an area for psychometric research. See also Shye
(1978) for further discussion.

At the beginning of the 1970's, statistical research at Bell Laboratories
changed direction. It is a mark of their achievement, however, that
the research initiated and carried out there during the 1960's produced
major contributions to the foundations of statistical graphics. It is also
worth noting that little systematic attempt was made outside of Bell
Laboratories during the 1960's to develop alternative graphical displays for
statistical data analysis. Exceptions to this include the work of Anscombe
mentioned  above,  and  the  novel  applications  of  Daniel's  half-normal  plots 
to  the  analysis  of  multidimensional  contingency  tables  (Cox  and  Lauh  (1967), 
Fienberg  (1969))  and  to  the  analysis  of  large  correlation  matrices  (Hills 
(1969)).   The  work  of  R.  Bachi  (1968)  on  his  graphical  rational  patterns 
should  also  be  mentioned  here,  together  with  that  of  J.  Bertin  (1967).   See 
also  Healy  (1968),  who  developed  methods  for  multivariate  normal  plotting. 

IV.       THE   SEVENTIES 

Primary  research  into  graphical  techniques  during  this  decade  shifted 
away  from  probability  plots  and  towards  methods  for  analyzing  multiresponse 
data,  and  especially  for  detecting  multiresponse  outliers.   Robustness  of 
statistical  procedures  was  also  a  key  area  of  concern. 

One  area  in  particular  seemed  to  offer  excellent  opportunities  for  the 
use  of  graphical  displays  —  cluster  analysis.   To  complement  the  growing 
development of clustering algorithms (see, for example, Hartigan (1975)),
pictorial    (two-dimensional)    representations   of  multivariate  data    started 
to   appear    in   the   statistical   literature.      These   included   the  function  plots 
of    D.F.    Andrews    (1972)    and    the   cartoon   faces   of   H.    Chernoff    (1973). 

The idea behind Andrews' function plots was to replace each r-vector-valued
observation, $x' = (x_1, x_2, \ldots, x_r)$, by a curve on the interval
$(-\pi, +\pi)$; the curve is constructed as a linear combination of the
coordinates of x, the coefficients of which are chosen to be orthonormal
functions of a single variable t. Andrews suggested using the following
function:

$$f_x(t) = x_1/\sqrt{2} + x_2 \sin t + x_3 \cos t + x_4 \sin 2t + x_5 \cos 2t + \ldots$$

This function is then plotted against values of t in the interval $(-\pi, +\pi)$
for  each  multivariate  observation.   A  function  plot  of  all  curves  derived 
from  the  data  set  is  used  to  identify  clusters  of  multivariate  observations. 
Similar  curves  appear  as  a  band  across  the  function  plot,  and  so  the  ob- 
servations corresponding  to  the  curves  making  up  a  particular  band  are  sub- 
sequently treated  as  an  individual  cluster.   Different  bands,  therefore, 
get  associated  with  different  clusters.   The  plots  are  also  used  to  pick 
out  multivariate  outliers  in  the  data.   In  order  to  absorb  maximum  visual 
information  from  the  graph,  Andrews  suggests  plotting  only  about  ten  such 
curves  per  graph.   Otherwise  the  plot  tends  to  look  cluttered.   For  examples 
of  such  function  plots  applied  to  data  from  physical  anthropology,  where 
typically the number of dimensions is high and species of primates form
natural clusters, see Oxnard (1975). Extensions of this idea may be found
in Andrews (1973, 1976), and in Tukey and Tukey (1977).
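
Andrews' function is simple to evaluate directly. The following Python
sketch computes the curve for each observation on an even grid of t values;
the function names, the grid size, and the two sample observations are
assumptions made for illustration only.

```python
import math


def andrews_curve(x, t):
    """Evaluate x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + ... at t."""
    value = x[0] / math.sqrt(2)
    for j, coord in enumerate(x[1:], start=1):
        harmonic = (j + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if j % 2 == 1:
            value += coord * math.sin(harmonic * t)
        else:
            value += coord * math.cos(harmonic * t)
    return value


def andrews_curves(data, n_grid=101):
    """Return the t grid on [-pi, pi] and one list of curve values per row."""
    grid = [-math.pi + 2 * math.pi * i / (n_grid - 1) for i in range(n_grid)]
    return grid, [[andrews_curve(x, t) for t in grid] for x in data]


# Two hypothetical three-dimensional observations that lie close together.
grid, curves = andrews_curves([[1.0, 2.0, 3.0], [1.1, 2.1, 2.9]])
```

Observations that are close in the original space give curves that stay
close for every t, which is why similar observations appear as a band.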

Chernoff's transformation of a multivariate data set into a collection
of cartoon faces was an inspired attempt to exploit the feeling that people
recognize human-type faces very quickly and can group like faces together
without  much  hesitation.   Therefore,  each  variable  is  associated  with  a 
particular  facial  characteristic  —  shape  of  upper  face,  of  lower  face, 
location  and  shape  of  eyes,  length  of  nose,  curvature  of  mouth,  etc.   This 
technique  was  illustrated  on  a  variety  of  data  sets  by  Chernoff,  by  Everitt 
and  Nicholls  (1976),  Fienberg  (1979),  Wang  (1978),  and  by  others.   Accord- 
ing to  Wang  (1978),  such  types  of  faces  are  currently  being  used  as  follows: 
by  broadcasting  networks  to  predict  the  results  of  televised  football  games; 
and  by  others  to  investigate  trends  in  U.S.  Supreme  Court  decisions;  to  mod- 
el Congress;  to  characterize  entire  newspapers;  to  aid  in  iconic  communica- 
tions and  psychiatric  screening  of  patients;  and  to  develop  urban  regional 
indicators  that  measure  quality  of  life.   These  displays  can  be  implemented 
for up to 18 variables by the FACES command in the TROLL system of programs
(Welsch and Bjaaland (1974)).

Such  types  of  semigraphical  displays  are  by  no  means  new.   E.  Anderson 
(1960)  had  already  proposed  the  use  of  glyphs  and  metroglyphs  for  two-di- 
mensional displays  of  multivariate  data.   Thus,  a  multivariate  observation 
would  be  represented  by  a  dot  with  rays  of  various  lengths  emanating  in 
different  directions  from  that  dot.   Each  ray  indicated  a  particular  var- 
iable, and  the  length  of  any  ray  was  a  visual  display  of  the  value  of  that 
variable.   Anderson  suggested  that  best  results  are  achieved  when  at  most 
seven  rays  are  drawn  in  the  picture.   It  is  also  possible  to  increase  the 
dimensionality  represented  by  the  metroglyphs  (without  cluttering  up  the 
graph) by plotting the metroglyphs as points in a scatterplot. Chernoff's
cartoon faces are a variant on this theme. Alternative representations
along these lines include stars (Friedman, et al (1972), Welsch and
Bjaaland (1974)), trees (Kleiner and Hartigan (1978)), and symbolic matrices
(Cleveland, et al (1978)). The PQ-plots of Diaconis (1978) are of interest
here and are related to the linked views of Tukey and Tukey (1977).

Most   of    these  attractive    ideas,    however,    suffer    to    some  degree   because 
the   clusters   produced    by  visual    inspection  of    the  above   types   of   displays 
are  not    independent   of    the  particular   permutation   of   coordinates  used    to 
produce   those  displays.      This   point   was   demonstrated    for   the   case  of    the 
faces   displays    in  an   experimental    setting   by   Chernoff   and    Rizvi    (1975). 
It would certainly be desirable to have some type of permutation-invariant
representation   for   clustering  multivariate  data,    but    this    is   probably  very 
difficult    to   realize    (however,    see  Kleiner   and   Hartigan    (1978)).      The   best 
policy   for   any  of    the  above   types   of   representations    is   probably   to   repeat 
the  visual   clustering   a   number   of    times  using   different   associations   of 
coordinates    (variables)    with   the  graphical   characteristics. 

Additional    semigraphical   displays   appeared    in   a   three-volume  work    (1970), 
a   paper    (1972),    and   a   book    (1977)    on   exploratory  data  analysis   by   J.W.    Tukey. 
New   ideas   and    terminology   for   displaying  various    summary    statistics   of   bat- 
ches  of   numbers   were  presented    there  and   have   rapidly   become  a   routine   part 
of    exploratory  data   analysis.      Concepts    such  as   box   plots    (and    their   deriva- 
tives,   box-and-whisker   plots   and    schematic   plots)    and    stem-and-leaf    plots 
are  now  taught    in    introductory   courses    in   statistics   as    if    they  always   be- 
longed  there.      Some   further   work  on  box   plots  appears    in  a   recent   paper   by 
McGill,    Tukey,    and   Larsen    (1978).      Tukey  also    introduced   a   number   of   dis- 
plays  for    the  residuals   from  a   fitted   two-way   table  analysis;    these   included 
two-way  plots   and   diagnostic   plots.      Related   work  to    stem-and-leaf   plots 
can be found in the double-dual histograms of Dallal and Finseth (1977) and
in   the   suspended   rootogram   of    Tukey    (1972)    (see  also   Wainer    (1974)). 

Graphical    schemes    intended    to    improve   the  visual    impact   of   a    scatterplot 
of two variables, X and Y, were presented by Tukey (1977) and by Cleveland
and Kleiner (1975). Both superimposed three curves on the scatterplot to
chart the change in the distribution of Y given X. Tukey's display showed
a smoothing of the medians and upper and lower hinges (or quartiles) of
the Y-values for given X-values; the resulting curves were termed the
middle (median) trace and upper and lower (hinge) H-traces respectively. An
extension of this idea to finer partitions of the Y-values is called
delineations by Tukey. In Cleveland and Kleiner's scheme, the three curves
are called moving midmeans, moving lower semi-midmeans, and moving upper
semi-midmeans of Y given X. The midmean is the average of all observations
between the quartiles, and the lower (upper) semi-midmean is the midmean of
all observations below (above) the median. The word "moving" is used in the
same sense as in "moving average". Cleveland and Kleiner used their
moving scatterplot technique to obtain a better analysis of air-pollution
data from the East coast. See also the paper by Cleveland, et al (1976).
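
The midmean and semi-midmean definitions translate directly into code. The
sketch below is an illustration only and is not Cleveland and Kleiner's
implementation: the quartile convention (dropping floor(n/4) points from
each end), the exclusion of the median itself for odd sample sizes, the
fixed-size moving window, and all function names are assumptions.

```python
def midmean(values):
    """Mean of the central half: drop floor(n/4) points from each end."""
    ordered = sorted(values)
    trim = len(ordered) // 4
    middle = ordered[trim:len(ordered) - trim]
    return sum(middle) / len(middle)


def semi_midmeans(values):
    """Midmeans of the observations below and above the median."""
    ordered = sorted(values)
    n = len(ordered)
    lower, upper = ordered[: n // 2], ordered[(n + 1) // 2:]
    return midmean(lower), midmean(upper)


def moving_midmeans(xs, ys, window):
    """Slide a window of consecutive X-ordered points along the scatterplot.

    Returns tuples (x at window centre, lower semi-midmean, midmean,
    upper semi-midmean) of the Y-values in each window.
    """
    pairs = sorted(zip(xs, ys))
    trace = []
    for start in range(len(pairs) - window + 1):
        block = pairs[start:start + window]
        block_y = [y for _, y in block]
        low, high = semi_midmeans(block_y)
        trace.append((block[window // 2][0], low, midmean(block_y), high))
    return trace


# Hypothetical data: Y equals X, observed at X = 0, 1, ..., 9.
trace = moving_midmeans(range(10), range(10), window=8)
```

Joining the second, third and fourth entries of successive tuples gives the
three curves to be superimposed on the scatterplot.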

A different type of two-dimensional plot, termed the biplot, was conceived
of by Gabriel (1971) for displaying the structure of large data matrices.
In many applications, a rank-two matrix, $Y^{(2)}$ say, is a reasonably
good approximation to the $(n \times p)$-data matrix Y. Since $Y^{(2)} = GH'$,
for some $(2 \times n)$-matrix $G' = [g_1, \ldots, g_n]$ and some
$(2 \times p)$-matrix $H' = [h_1, \ldots, h_p]$, the n + p two-dimensional
vectors, $g_1, \ldots, g_n$ and $h_1, \ldots, h_p$, may be
treated as "row effects" and "column effects" respectively of the matrix
$Y^{(2)}$, and hence by extension, of the data matrix Y. (A suitable metric
may be used to ensure that the factorization $Y^{(2)} = GH'$, and hence, its
associated biplot, is unique.) The biplot, then, is a simultaneous
scatterplot of all those n + p two-dimensional vectors. Applications to
principal component analysis (leading to a principal component biplot) and to
the analysis of two-way tables (Bradu and Gabriel (1978)) are specifically
considered. A rank-3 approximation to Y and associated displays are
described by Weber (1978).

A number of papers on regression analysis appeared during the 1970's,
many   of    them   emphasizing   the  role  of   graphical  displays.      It    is   convenient 
to   distinguish  three   kinds   of   graphics   for   regression   situations:      those 
concerned   with   estimating    the  regression  coefficients,    those  concerned   with 
subset    selection  procedures,    and   those   concerned   with  residual   analysis. 

Graphical aspects of the estimation of the regression coefficients $\beta$
in the linear regression model $y = X\beta + e$, $E(e) = 0$,
$\mathrm{Cov}(e) = \sigma_e^2 I_n$, were discussed by Hoerl and Kennard
(1970a, 1970b) and by Denby and Mallows (1977). Both approaches studied the
sensitivities of specific estimators of $\beta$, by introducing an additional
parameter into the model.

Hoerl and Kennard's approach (initially proposed by Hoerl during the
1960's in the context of chemical engineering) was termed "ridge
regression".  They proposed using a "ridge estimator",
$\hat{\beta}(k) = (X'X + kI_p)^{-1}X'y$, where $k \ge 0$ represents an
additional parameter to be estimated, in place of the ordinary
least-squares estimator, $\hat{\beta} = (X'X)^{-1}X'y$, of $\beta$.  Thus,
the parameter k reflects a perturbation of the (possibly ill-conditioned)
matrix $X'X$.  Although $\hat{\beta}(k)$ is a biased estimator of $\beta$
(for $k > 0$), it has been shown that $\hat{\beta}(k)$ can perform better
than $\hat{\beta}$ in terms of mean square error if k is chosen correctly.
Consequently, a number of formal and informal rules have been suggested
(and compared) for estimating k.  A graphical method of viewing the effect
of k on the ridge estimator is to plot the p individual components,
$\hat{\beta}_i(k)$, $i = 1, 2, \ldots, p$, of the vector $\hat{\beta}(k)$
on the same graph against the associated values of k, perhaps together with
the residual sum of squares function
$\phi(k) = (y - X\hat{\beta}(k))'(y - X\hat{\beta}(k))$.  Such a display is
called a ridge trace plot, and an estimate of k is obtained from that plot
by using the smallest value of k for which the regression coefficient
estimates "stabilize" (i.e., no rapid change in the coefficients takes
place as k moves away from zero).  Certain reservations, however, have been
voiced regarding the ridge regression technique, and it is generally
advised to use it with caution.  See, for example, Thisted (1976).
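
The ridge trace described above can be sketched numerically.  The following
is only a minimal illustration, assuming NumPy is available; the function
names, the grid of k values and the artificial, nearly collinear data are
all hypothetical and are not taken from Hoerl and Kennard.

```python
import numpy as np

def ridge_trace(X, y, ks):
    """Row j of the result holds the ridge estimate (X'X + k I)^{-1} X'y for ks[j]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    XtX = X.T @ X
    Xty = X.T @ y
    # Solve the perturbed normal equations rather than forming an inverse.
    return np.array([np.linalg.solve(XtX + k * np.eye(p), Xty) for k in ks])

def residual_ss(X, y, beta):
    """Residual sum of squares (y - X beta)'(y - X beta)."""
    r = y - X @ beta
    return float(r @ r)

# Made-up data with two nearly collinear columns (illustration only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 0.01 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=50)

ks = np.linspace(0.0, 1.0, 21)      # k = 0 reproduces ordinary least squares
trace = ridge_trace(X, y, ks)       # shape (len(ks), p)
rss = [residual_ss(X, y, b) for b in trace]
```

Plotting each column of `trace` against `ks`, with `rss` on a second axis if
desired, gives the ridge trace; the choice of k is then made by eye, as the
text describes.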

Two diagnostic displays were proposed by Denby and Mallows of Bell
Laboratories for studying the effect of varying a "trimming parameter"
h ($0 < h < \infty$) on Huber's robust M-estimator, $\hat{\beta}(h)$, of
the regression coefficients $\beta$.  The M-estimator is obtained by
successively reapplying weighted least squares to an initial estimator of
$\beta$, in which the weights depend on the residuals.  The weight function
was chosen to be

$$W_h(t) = \begin{cases} 1, & |t| \le h, \\ h/|t|, & |t| > h, \end{cases}$$

where h denotes the number of points trimmed from the estimation stage.  A
discrete set of integer values of h, $h_G < h_{G-1} < \cdots < h_2 < h_1$,
are considered for application of the plotting method.  The first display
plots elements of $\hat{\beta}(h_g)$ against $h_g$, $g = 1, 2, \ldots, G$;
the second display plots the individual residuals, $r_i(h_g)$, against
$h_g$, $g = 1, 2, \ldots, G$.  For both plots, it is recommended that
successive points be joined up to obtain plots of $\hat{\beta}(h)$ and
$\{r_i(h)\}$ against h for $h_G \le h \le h_1$.  These displays, called
Huber traces by Welsch (1976), then provide a useful means of detecting
outliers in the data, if they occur, and their influence on the estimation
procedure; they also give a pictorial view of the sensitivity (and hence,
stability) of regression coefficient estimates and residuals under various
trimming conditions.
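
A rough sketch of how such traces might be computed follows.  It assumes
NumPy, takes ordinary least squares as the initial estimator, leaves the
residuals unscaled, and runs a fixed number of reweighting passes; none of
these choices is specified in the text, and the data and grid of h values
are invented for illustration.

```python
import numpy as np

def huber_weight(t, h):
    """W_h(t) = 1 for |t| <= h and h/|t| otherwise, written as h / max(|t|, h)."""
    return h / np.maximum(np.abs(np.asarray(t, dtype=float)), h)

def huber_fit(X, y, h, n_iter=100):
    """Iteratively reweighted least squares with Huber weights (unscaled residuals)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]   # assumed starting point: OLS
    for _ in range(n_iter):
        w = huber_weight(y - X @ beta, h)
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta, y - X @ beta

# Made-up straight-line data with one gross outlier.
x = np.arange(10, dtype=float)
X = np.column_stack([np.ones(10), x])
y = 1.0 + 2.0 * x
y[9] += 50.0

hs = [8.0, 4.0, 2.0, 1.0]                         # hypothetical grid, h_1 > ... > h_G
fits = [huber_fit(X, y, h) for h in hs]
coef_trace = np.array([b for b, _ in fits])       # one row of coefficients per h
resid_trace = np.array([r for _, r in fits])      # one row of residuals per h
```

Joining the columns of `coef_trace` and of `resid_trace` across the values in
`hs` gives the two displays; in this example the residual for the outlying
case stays large as h shrinks while the slope moves back toward 2.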

As an aid for subset selection problems in regression, a graphical
method of comparing fitted equations was given by Mallows (1973), based
on the $C_p$ statistic.  This idea was initially formulated by Mallows and
Daniel in 1963.  See Daniel and Wood (1969).  $C_p$ is defined, for a
particular value of p, as $(2p - n) + \mathrm{SSE}_p/\hat{\sigma}^2$, where
p is the number of independent variables in the given subset, n is the
total number of cases, $\mathrm{SSE}_p$ is the residual sum of squares from
the regression of the dependent variable on the given subset of p
independent variables, and $\hat{\sigma}^2$ is an estimate of the error
variance calculated from all r independent variables considered for
inclusion in the final regression equation.  Since each variable can either
be included in or omitted from the regression equation, there are a total
of $2^r$ possible subsets to consider; for each of these subsets, a value
of $C_p$ can be calculated, $p = 0, 1, \ldots, r$.  A $C_p$-plot, then, is
a plot of $C_p$ versus p, and a choice of optimal subset is that subset for
which $C_p$ is lowest on the plot.  Such a choice may not be unique, since
several different subsets may have very close values of $C_p$.
Furthermore, even if r is relatively small, performing $2^r$ regressions
can be somewhat disconcerting to the user.  However, due to an efficient
algorithm by Furnival and Wilson (1974), it is no longer necessary to
compute all $2^r$ possible regressions and their associated $C_p$ values;
instead, the algorithm decides in a sequential manner which regression to
compute next on the basis of its $C_p$-value (typically, subsets with
$C_p < p$ are considered as candidates for the final selection) and then
those subsets with, say, the smallest 5 or 6 $C_p$-values can be
immediately printed out.  Consequently, the need for plotting all $C_p$
values has more-or-less disappeared.
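
For small r the definition can be applied by brute force.  The sketch below
assumes NumPy and fits every subset without an intercept (as if the
variables had been centered), so that p counts all fitted coefficients as in
the formula above.  It is a plain enumeration, not the Furnival and Wilson
algorithm, and the data are invented.

```python
import itertools
import numpy as np

def sse(X, y, cols):
    """Residual sum of squares from regressing y on the listed columns (no intercept)."""
    if not cols:
        return float(y @ y)
    A = X[:, list(cols)]
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    resid = y - A @ beta
    return float(resid @ resid)

def all_subsets_cp(X, y):
    """Return (p, columns, C_p) for each of the 2**r subsets of the r columns of X."""
    n, r = X.shape
    sigma2 = sse(X, y, tuple(range(r))) / (n - r)    # error variance from the full model
    results = []
    for p in range(r + 1):
        for cols in itertools.combinations(range(r), p):
            cp = (2 * p - n) + sse(X, y, cols) / sigma2
            results.append((p, cols, cp))
    return results

# Made-up data: only the first two of four variables matter.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(size=60)

results = all_subsets_cp(X, y)
best = sorted(results, key=lambda item: item[2])[:5]   # five smallest C_p values
```

A $C_p$-plot is then a scatter of the third element of each tuple against the
first; listing `best` corresponds to printing the few smallest values
directly.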

Many   papers   on  residual   analysis   and   outlier   detection  appeared   during 
the 1970's, and of those, the following dealt with graphical displays.
Larsen  and   McCleary    (1972)    proposed   using   partial   residual   plots   to   assess 
the    importance  of    specific    independent   variables   to   the  regression    in  the 
presence  of   all   other    independent   variables.      Cleveland,    et   al    (1978) 
suggested plotting the absolute regression residuals both against the
independent variables and against the fitted values; they also suggested
that a visual display of $R^2$ or $\hat{\sigma}^2$ could be obtained by
plotting the observed responses against the corresponding fitted values.
Gamma plots of squared
residuals  (or,  equivalently,  half-normal  plots  of  absolute  residuals)  were 
suggested  by  Gnanadesikan  and  Kettenring  (1972)  for  regression  residuals, 
and  by  Gentleman  and  Wilk  (1975a,  1975b)  for  residuals  from  a  two-way  table 
fit.   Graphical  analysis  of  residuals  from  censored  data  is  given  by  Nelson 
(1973)  using  methods  based  on  hazard  plotting  (see  below) .   Some  discussion 
of  peculiarities  in  probability  plots  of  residuals  is  given  in  Andrews  and 
Pregibon  (1977),  who  instead  suggested  using  augmented  residual  plots  which 
are  more  sensitive  to  particular  forms  of  departures  from  the  assumed  model. 

At  a  different  level,  Andrews  and  Tukey  (1973a,  1973b)  outlined  a  meth- 
od of  compressing  much  of  the  information  in  a  scatterplot  or  in  a  full-nor- 
mal plot  of  residuals  (or  of  any  other  unstructured  batch  of  numbers)  into 
a  six-line  plot  for  use  on  a  teletypewriter  or  similar  device.   Although 
these  types  of  printer  plots  are  a  little  difficult  to  interpret  at  first 
glance,  and  although  multiple  points  cannot  be  plotted  at  the  same  position 
due  to  the  symbols  used,  they  do  fulfill  an  obvious  need  for  fast  genera- 
tion of  graphs  in  an  interactive  regression  analysis  system. 

Review  articles  on  multivariate  residual  analysis  and  outlier  detec- 
tion were  presented  by  Gnanadesikan  and  Kettenring  (1972)  and  Gnanadesikan 
(1973),  in  which  the  emphasis  was  placed  on  graphical  analysis  using  scat- 
terplots  of  pairs  of  the  first  (or  last)  few  principal  components,  and  gamma 
plots  of  suitably  chosen  quadratic  forms  in  the  multivariate  regression  re- 
siduals. 

Two  graphical  methods  of  determining  the  dimensionality  of  a 
multivariate regression were given by Izenman (1980).  The problem is to
assess the statistical rank of the regression coefficient matrix C for the
multivariate  regression  model   Y   =  M   +   CX   +   e.      The   first  method   uses   a   plot 
of    the  rank  trace,    in  which  a   point    is   plotted   corresponding   to    each  possi- 
ble rank  of    C.      In   this   plot,    the   horizontal   coordinate   is  a    function  of    the 
difference   between  a   "reduced-rank"   regression  coefficient   matrix  and    the 
full-rank  regression  coefficient   matrix,    while   the  vertical   coordinates    show 
the  proportionate  reduction    in  the  amount   of    residual  variation   from  using 
a    simple   full-rank  model   rather    than   the   computationally  more   elaborate  re- 
duced-rank model.      Typically,    these   points   lie    inside   the  unit    square,    and 
when   successive   points   are  joined  up,    the   rank  trace   is   obtained.      The   sta- 
tistical  rank  of    C   is   then   the   smallest   rank  for   which   the  corresponding 
plotted   point    is   approximately   zero.      The    second  method   uses   an   ordered    se- 
quence of   gamma   plots   of    the  residual     vectors   derived   from  all   possible  re- 
duced-rank regressions.      As   long   as    the   "true"   statistical   rank  of    the  re- 
gression  coefficient   matrix    is   larger    than   those  values   of    the  rank  being 
considered,    the  corresponding   gamma   plots    should   differ  markedly   for   differ- 
ent  rank  values.      When   the  rank  reaches   this    true   rank,    the  plots   should 
cease   to   change  and    should   become  more   stable.      These   two   graphical   proce- 
dures  can  also   be  used    to   determine   the  number   of  principal   components   or 
pairs of canonical variates to use in any given application.

Some  additional   work  concerning   probability   plotting  methods   did   appear 
during the 1970's.  Zahn (1975a, 1975b) presented a corrected and modified
version   of    Daniel's   half-normal   plot;    major   corrections   to    the  original   cri- 
tical  values  and   plotting   positions   were  worked   out,    while  a  minor  modifica- 
tion was  made   to   the  nomination  procedure   so    that   only   the   smallest   70%   of 
the   contrasts  not   declared    significant   were  used    to   construct    the   final 
estimate of the error variance.  Daniel (1976, p. 149), however, nearly
twenty   years  after    the  appearance  of    his   1959   paper   on   half-normal   plotting, 
remarked    that,    for   certain  applications,    "the   signed   contrasts    in   standard 
order   have  more    information   in   them   than  do   the  unsigned   contrasts   ordered 
by  magnitude";    he  noted    that    this   occurred   whenever   peculiarities   discovered 
in   the  data   were   all    strongly   sign-dependent,    and   were  properties   of    spe- 
cific   subsets   of    the  data    set. 

Probability   plotting  methods   for   the  Weibull   distribution  were  devised 
by   Nelson   and    Thompson    (1971),    to   be  used    in   life-testing    situations;    see 
also   King    (1971) . 

Hazard   plots   for    the   graphical   analysis   of   multiply   censored    life   data, 
which  consist   of    times   to    failure  on   failed   units,    intermixed   with    (random) 
censoring    times   on   unfailed   units,    were  presented    by  Nelson    (1972) ,    who   gave 
the  necessary  nomenclature,    theory  and   applications   of   hazard   plotting   to 
the   exponential,    normal,    lognormal,    extreme-value,    and   Weibull   distributions. 
The hazard function (also known as instantaneous failure rate, force of
mortality) corresponding to a particular cumulative distribution function
$F(x)$ with associated probability density $f(x)$, is defined as
$h(x) = f(x)/(1 - F(x))$, and the integral of $h(t)$ up to time x is
$H(x) = -\log(1 - F(x))$, the cumulative hazard function.  Hazard plotting
papers were constructed so that the
relationship   between  H(x)    and   time  x    is    linear,    with   the  probability    scale 
on  the   hazard   paper   exactly   the   same  as   that   on   the  corresponding   probabil- 
ity  paper.      The   hazard   plot,    which  can   be   interpreted    in   the    same  way   as   a 
probability   plot,    is  used    to    estimate   such  unknowns   as   distributional   param- 
eters,   the  proportion  of   units   failing   by   a   given   age,    percentiles   of    the 
distribution,    the   behavior   of    the   failure-rate  of    the  units   as   a   function 
of    their   age,    and    conditional   failure  probabilities   for   units   of   any   age. 
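
The plotting positions for a hazard plot of multiply censored data can be
sketched as follows.  The reverse-rank recipe used here is the one commonly
associated with hazard plotting and is offered as an assumption, since the
survey does not spell it out; ties are ignored, and the function names and
sample data are hypothetical.

```python
import math

def cumulative_hazard(times, failed):
    """Return (time, H) pairs for the failed units, in time order.

    Each unit, failed or censored, receives a reverse rank k = n, n-1, ..., 1
    after sorting by time; every failure adds 1/k to the running hazard total.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    total = 0.0
    points = []
    for position, i in enumerate(order):
        reverse_rank = n - position
        if failed[i]:
            total += 1.0 / reverse_rank
            points.append((times[i], total))
    return points

def hazard_to_probability(H):
    """Invert H(x) = -log(1 - F(x)) to recover F(x)."""
    return 1.0 - math.exp(-H)

# Four units; the unit observed at time 20 was censored (still running).
points = cumulative_hazard([10, 20, 30, 40], [True, False, True, True])
# points == [(10, 0.25), (30, 0.75), (40, 1.75)]
```

Plotting the time against the cumulative hazard on the appropriate hazard
paper (for the exponential distribution, ordinary linear scales) should give
roughly a straight line when the assumed distribution fits.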
It is worth noting the recent appearance of a number of books on
graphical methods, such as King (1971), Gnanadesikan (1977), Everitt (1978),
and  Wang    (1978a),    which   include  discussion  and   details   of  many   of    the  contri- 
butions  outlined    in   this   survey.      Furthermore,    there  are   some  additional 
books  which  contain  recognition  of    the  usefulness   of   graphical  methods;    see, 
for    example,    Barnett   and   Lewis    (1978),    whose  discussion    is   tailored    to   the 
detection   of    outliers    in   large  data    sets. 

V.       PROSPECTS   FOR    1980   AND    BEYOND 

There  are  two   primary   directions   that    statistical   computer   graphics 
can  go:      graphics    similar    to   the  PRIM-9    system  and   color-coded   graphics. 

PRIM-9 (for: Picturing, Rotation, Isolation and Masking, in up to
9  dimensions)    is   an   interactive  data   display  and   analysis    system,    in  which 
interaction    (by   the  user)    is  accomplished    through 

(i)      a    solid-state  alphanumeric    keyboard, 
(ii)      a   light-pen   to   determine  possible  display   options, 
(iii)      a    function  keyboard   with   32    buttons. 
The  main   features   of    this    system  are    (a)    the  ability   to    explore  multidimen- 
sional  data  using   real-time   continuous   rotations   of    the  data   about   any   cen- 
ter  point   and    to   view  the  results   along   any   of    the    (at   most,    36)    two-dimen- 
sional  projection   planes     and    (b)    its   built-in  automatic   projection  pursuit 
algorithm  which   searches   for    those  projections   of    the  data   that   provide   the 
most    interesting    structure. 

So  far,  experience  with  PRIM-9  has  apparently  been  limited  to  multi- 
variate high-energy  physics  data  and  multivariate  discrimination  analysis 
in   pattern  recognition   problems. 

Since   the   hardware  used   to    implement   PRIM-9    is,    at   the  moment,    very 
expensive,    and    clearly   not   yet   available   for   general   use,    an  approximation 
to it has been provided by CLOUDS, A Troll Experimental Program of the NBER,
which was designed by Welsch and Bjaaland (1974) based on observations of
the PRIM-9 system (see also Welsch (1976)).  CLOUDS is really a "discretized"
version of PRIM-9, and is currently on line through EDUNET.

Color-coded  graphics  are,  on  the  other  hand,  already  with  us.  We  are 
now  in  a  third  generation  of  computer  graphics:  first,  the  CALCOMP  pen-and- 
ink  plotter;  then  the  TEKTRONIX  (1973)  CRT  graphics  terminals;  and  now  color 
graphics  terminals,  such  as  the  APPLE  II  (1978)  computer  system  which  utili- 
zes color  TV  sets  as  video  screen  (no  sound)  and  a  special  cassette  recorder 
for   relaying   a    tape's    information    into    the  computer. 

Color   graphics   have   been  used   primarily   by   the  U.S.    Bureau   of    the   Census 
for   displaying    survey   and   census  material;    that    is,    situations    in   which   the 
variables   are  categorical.      This   has    led    to   the   study   of    two-variable  color 
statistical  maps   for   displaying   cross-information   from   two   categorical  vari- 
ables.     Each  variable,    perhaps   on   some   common   geographical    background,    is   first 
color-coded    by  using   different    saturations   of   a    specific    spectrum  color,    and 
then   the   two   resulting   "maps"   are   superimposed   on   one  another   to   produce  a 
map   color-coded   by   saturations   of    the   composite  color.      Empirical    evidence, 
however,    has   shown   that   color  maps   produced    in   this   way   can   be    (at    least, 
initially)    confusing   to   those  not    familiar   with  the  resulting    saturations   of 
the composite color.  See Fienberg (1979) for such an example.  Clearly,
such  color   graphics   displays   will   have   to   be   studied    quite   extensively,    and 
hopefully,    guidelines   for    their    efficient   use  will   appear    in   the   near    future. 
ABSTRACT
The  search  for  rules  for  effective  graphical  display,  whether  for  the 
purpose  of  communication,  exploration  or  reconstitution,  has  been  hampered 
by  the  lack  of  a  cohesive  body  of  experimental  evidence  regarding  the  para- 
meters of  efficacious  graphical  display.   To  some  extent,  existing  evidence 
is  diverse,  because  of  the  lack  of  a  coordinating  theoretical  structure  and 
an  allied  unified  graphical  vocabulary.   In  this  paper,  we  present  the  rud- 
iments of  Bertin's  theory  of  graphics  and  his  graphical  semiology.   We  also 
present  two  experimental  studies  of  "Two  Variable  Color  Maps"  which  we  of- 
fer as  examples  of  how  questions  about  the  efficacy  of  a  graphical  form 
can  be  addressed. 

I.       WELTANSCHAUUNG
Over  the  past  190  years,  from  Playfair  (1787)  to  Tukey  (1977),  the 
development  of  graphics  for  the  analysis  and  communication  of  data  has  pro- 
gressed in  fits  and  starts.   It  is  difficult  to  comprehend  exactly  why 
this  is  so,  especially  considering  how  useful  graphical  methods  are  for 
finding  something  that  you  were  not  expecting.   Perhaps  one  explanation  is 
to  be  found  in  the  literature  surrounding  Neurath's  work.   Otto  Neurath 
(1973)  elaborated  the  unit-symbol  graphical  scheme  (Brinton,  1914)  into  a 
system he dubbed "ISOTYPE", in which bar chart-like graphics are made
through the multiple use of metaphorical figures (see Figure 1 as an
example).

Figure 1

A  number  of  detractors  complained  that  the  role  of  science  was  not  to 
"draw  little  men".   When  confronted  with  the  fact  that  these  "little  men" 
made  the  comprehension  of  complicated  data  easier,  the  response  was  a 
continuation  of  Florence's  (1929,  p.  60)  scoffing  that  "the  chief  purpose 
of  this  sort  of  issue  is  to  tickle  the  imagination  of  school  children  and 
adults  mentally  of  school  age,  or  to  impress  the  tired  business  man  or 
the  bored  politician."  The  feeling  expressed  was  that  a  true  scientist 
does not need to be spoon fed information like pablum.  Whether the
information load on modern scientists has gotten increasingly heavy, or
scientists  have  gotten  intellectually  softer,  is  hard  to  determine,  but 
there  has  been  an  increasing  emphasis  on  graphics  recently  by  some  of  the 
most  impeccable  of  scientific  minds. 

John  Tukey  (1971,  1977)  has  proposed  a  wide  variety  of  graphical 
schemes  for  data  analysis,  from  stem-and-leaf  displays  to  suspended  rooto- 
grams  to  generalized  Box  Plots  (McGill,  Tukey  and  Larsen,  1978).   Herman 
Chernoff  (1971)  has  gone  Neurath  one  better,  and  proposed  cartoon  faces 
for  the  display  of  multivariate  data.   Gnanadesikan  (1976)  has  offered  a 
wide  variety  of  interesting  kinds  of  plots  for  the  analysis  of  multivariate 
data,  and  further  work  in  this  area  is  tumbling  out  of  Bell  Labs  daily.   A 
chronology of graphical developments was prepared by Beniger and Robyn
(1978); however, the revival of work on graphics rendered its coverage of
recent  developments  out  of  date  before  their  paper  was  published. 

In  addition  to  work  on  new  graphical  forms,  and  new  uses  for  old 
forms,  machines  have  proliferated  that  make  graphic  production  easier  still. 
There  are  graphics  terminals,  flatbed  and  drum  plotters  (both  monochrome 
and  multicolored),  small  format  desk  top  drawing  machines  and  enormous 
plotters  taking  up  whole  rooms  in  the  Census  Bureau.   Of  course,  caboosing 
the  increased  production  of  hardware  are  the  software  developments  for  au- 
tomated graphics.   The  prospective  grapher  now  need  only  sit  down  and  type 
"BAR  CHART"  and  one  appears.   Equally  important,  the  cost  of  the  equipment 
to  do  computerized  graphics  is  not  very  different  than  the  cost  of  outfit- 
ting a  draftsman's  shop  a  decade  ago. 

Thus  statisticians  have  produced  many  different  kinds  of  pablum,  and 
engineers  new  and  better  spoons,  but  are  we  better  off?   Ehrenberg  (1977) 
believes  that  for  many  purposes  a  graphical  presentation  is  a  waste  of  time, 
and  a  well  designed  table  is  better.   Kruskal  (1977)  urges  the  interaction 
of  statisticians  and  psychologists  in  the  study  of  graphical  forms  as  a
problem  in  human  factors.   What  graphics  are  better  for  which  purpose,  for 
whom?   The  history  of  empirical  research  in  the  use  of  graphical  forms  for 
the  communication  of  information  has  been  unsystematic  and  is  not  cohesive, 
with  many  small  studies  and  no  integrating  theory  or  vocabulary. 

The  need  for  integrated  experimentation  on  graphical  forms  and  an  al- 
lied development  of  a  theoretical  structure  has  become  increasingly  appar- 
ent.  These  needs  have  moved  the  American  Statistical  Association  to  con- 
stitute a  committee  on  graphics  to  consider  carefully  the  role  graphics 
has  within  the  disciplines  of  statistics,  and  to  point  the  way  to  further 
exploration.   Fienberg's  recent  paper  (1978)  describes  some  aspects  of  this 
investigation.   In  his  paper  he  describes  the  "Two  Variable  Color  Map"  de- 
veloped by  the  Census  Bureau,  and  offers  some  criticisms.   In  this  paper, 
we  would  like  to  answer  Fienberg's  cry  by  describing  the  rudiments  of  Ber- 
tin's  (1973)  theory  of  graphics  using  his  terminology  (as  translated  by 
Berg  and  Wainer,  Note  1).   We  have  chosen  these  terms  in  the  hope  of  con- 
tributing to  the  establishment  of  a  standardized  graphic  vocabulary.   This 
follows  from  the  contention  that  if  we  do  not  have  a  vocabulary  to  discuss 
graphic  concepts,  those  concepts  will  not  be  discussed  in  an  unambiguous 
manner.   Bertin's  words  are  relatively  well  defined  and  uniquely  related 
to  graphical  concepts. 

In  this  paper,  the  theoretical  structure  and  graphic  vocabulary  are 
oriented  around  a  single  problem  —  the  study  of  Two  Variable  Color  Maps. 
In the following sections, we shall discuss two experiments which
investigate these maps.  Although the experiments are relatively simple and
straightforward,  they  become  rather  powerful  when  couched  within  a  theo- 
retical structure.   The  results  of  these  investigations  indicate  that 
careful  experimentation  is  crucial  before  a  new  graphical  form  is  adopted, 
particularly  if  it  is  adopted  by  a  pace-setting  organization  like  Census.
In  situations  like  this,  a  particular  form  can  become  conventional  simply 
because  of  the  authority  of  its  originators,  and  not  because  of  its  in- 
trinsic merit.   Yet,  in  questions  of  science,  the  authority  of  a  thousand 
is  (paraphrasing  Galileo)  not  worth  the  humble  experiments  of  a  single 
individual. 

The  following  is  a  detailed  description  of  some  of  the  problems  that 
Two  Variable  Color  Maps  were  developed  to  alleviate,  and  two  humble  exper- 
iments which  indicate  how  well  they  seem  to  have  done  it. 

II.       INTRODUCTION 
For  many  years,  statisticians  and  cartographers  have  been  concerned 
with  the  effective  display  of  statistical  information  on  a  geographic 
background.   To  understand  the  problems,  let  us  consider  a  map  display  of 
a  quantitative  data  component.   The  prospective  displayer  has  three  ques- 
tions to  resolve.   If  the  component,  say,  "median  family  income",  is  con- 
tinuous, its  graphical  representation  often  requires  that  it  be  categor- 
ized.  How  many  categories?  What  are  the  boundaries?   How  these  two  ques- 
tions are  answered  is  crucial  to  the  way  the  information  is  depicted.  Con- 
sider breaking  income  into  four  categories  as  follows: 

(1)  under $2,000
     $2,000 - $2,500
     $2,501 - $25,000
     over $25,000

It would be an unusual area of the United States whose income distribution
would be depicted in an interesting way with such a categorization.  A
more usual categorization is:

(2)  under $7,000
     $7,000 - $9,999
     $10,000 - $14,999
     $15,000 or over

Presumably this classification scheme yields classes which are
substantively interesting, and has sufficient differentiation in the depiction to re-
flect the  distribution  of  the  information  accurately.   We  shall  term  the 
particular  categories  of  a  quantitative  component  the  "degrees"  of  that 
component.   The  number  of  degrees  is  the  "length"  of  the  component.   Thus 
the  categorization  in  (2)  has  four  degrees;  it  is  a  component  of  length 
four. 
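
In computational terms, assigning an observation to a degree is a lookup
against the category boundaries.  The following sketch uses the boundaries
of categorization (2); the names BOUNDS, LABELS and degree are hypothetical
and serve only to illustrate the vocabulary.

```python
from bisect import bisect_right

# Lower bounds of the second, third and fourth degrees in categorization (2).
BOUNDS = [7000, 10000, 15000]
LABELS = ["under $7,000", "$7,000 - $9,999", "$10,000 - $14,999", "$15,000 or over"]

def degree(income):
    """Index (0-3) of the degree into which a median family income falls."""
    return bisect_right(BOUNDS, income)

length = len(LABELS)   # the "length" of the component: four degrees
```

For instance, `LABELS[degree(9500)]` gives "$7,000 - $9,999"; changing BOUNDS
changes both the degrees and, through them, the picture the map presents.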

The  third  question  to  be  resolved  is  how  to  depict  the  quantitative 
component  graphically.   There  have  been  many  schemes  proposed  for  doing 
this.  Figure 2 shows dots of various sizes; Figure 3, bars of
various lengths; Figure 4, dots of various densities; Figure 5, shading
of various darkness; and Figure 6 shows the same effect in a stereogram.


Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

All  of  these  use  a  natural  visual  metaphor  (what  we  shall  call  a  "retinal
variable")  to  represent  the  underlying  quantitative  component.   It  is  a 
natural  metaphor  in  that  the  increasing  size  of  the  dots  represents  a  nat- 
ural order  which  corresponds  to  the  order  of  the  increasing  size  of  the 
statistical component.  Consider how poorly a map in which this
correspondence was not maintained would work, i.e.,

small dot = over $15,000
large dot = $10,000 - $14,999
medium small = $7,000 - $9,999
medium large = under $7,000

This  is  so  clearly  foolish  that  we  need  not  consider  it  further.   That  is, 
the  use  of  a  naturally  ordered  retinal  variable  in  a  way  that  is  counter 
to  its  order  is  clearly  a  mortal  error.   A  more  venial  sin  is  to  represent 
an  ordered  component  with  a  retinal  variable  that  has  no  obvious  natural 
ordering.   In  this  situation  the  displayer  requires  the  viewer  to  memorize 
a  legend  and  compare  his  perception  to  that  memory.   This  makes  instantan- 
eous comprehension  of  the  data  structure  difficult,  and  comprehension 
even  after  study  tedious.   An  example  of  such  a  depiction  would  be: 

under  $7,000  =  green 
$7,000  -  $9,999  =  orange 
$10,000  -  $14,999  =  yellow 
$15,000  or  over  =  blue

One  might  ask  why  anyone  would  use  colors  to  display  badly  what  can  be 
done  far  better  and  more  cheaply  monochromatically.   Indeed,  there  does 
not  appear  to  be  any  good  reason.   However,  colors  do  have  a  natural  or- 
der, the  spectrum,  but  it  is  not  as  visually  compelling  as  the  monochrome 
schemes  illustrated  earlier.   One  could,  however,  use  various  saturations 
of  color  to  reflect  an  ordered  data  component;  this  method,  very  much  akin 
to  various  shadings  in  a  monochrome  scheme,  yields  much  the  same  visual 
impact.   Shown  in  Figure  7  is  a  map  in  which  this  approach  is  tried,  with 
one  modification;  the  lowest  degree  is  yellow,  and  the  next  three  degrees
are  three  saturations  of  blue.   This  scheme  seems  to  work  reasonably  well, 
although  it  does  require  a  modest  bit  of  memorization  (yellow=low)  which 
could  have  been  naturally  depicted  by  representing  the  lowest  degree  by  a 
maximally  unsaturated  blue  (white) .   It  also  seems  to  set  the  lowest  cat- 
egory perceptually  apart  from  the  other  three  which  may  not  be  justified 
by  the  data. 

It  would  seem  that  a  variety  of  solutions  exist  for  a  reasonable  dis- 
play of  a  single  ordered  component,  what  we  shall  term  a  "univariate  dis- 
play".  Suppose  we  have  two  statistical  components  that  we  wish  to  display 
simultaneously.   This  was  precisely  the  problem  that  was  addressed  by  the 
U.S.  Bureau  of  the  Census  (1975)  in  which  a  "Two  Variable  Color  Map"  was 
developed  and  described.   Subsequently,  two  variable  color  maps  were  used 
prominently  in  Census  work,  particularly  in  their  short-lived  monthly  mag- 
azine chartbook  of  social  and  economic  indicators,  STATUS.   This  sort  of 
map  displays  one  component  as  shown  in  Figure  7,  and  a  second  with  varying 
saturations  of  red  (see  Figure  8);  these  two  "one  component  maps"  are  then 
superimposed  and  their  colors  mixed  to  yield  the  "Two  Variable  Color  Map" 
shown  in  Figure  9. 
Figure 7

Figure 8

Figure 9

Let  us  now  empirically  examine  the  efficacy  of  this  display  technique.

III. GOOD FOR WHAT?

Graphical displays can be used to supply answers to questions at three
levels of sophistication. These have been termed (Bertin, 1973 and Note 1):

(1) The Elementary Level - translating the retinal variable back to
    the quantitative component (e.g., "What is the median family
    income at this place?").

(2) The Intermediate Level - relating trends seen in the retinal variable
    to some other informational component (e.g., "What is
    the relationship between median family income and distance
    from the inner city?").

(3) The Superior Level - comparing the entire structure of one component
    to that of another (e.g., "How does the geographic distribution
    of median family income relate to 'percent high
    school graduates'?").

In this paper, we shall describe two experiments which investigate how
well "Two Variable Color Maps" can be read at an elementary level. In
particular, we shall concern ourselves with (1) how quickly and how accurately
one can answer questions of the sort "What is happening at this place?".
We shall also investigate (2) the ease with which one can learn the rather
complex legend of the "Two Variable Color Map" and improve one's performance
with practice.

Bertin (1973), in his treatise on graphical methods, is clearly pessimistic
about the possibility of success of these complex color maps,
although he presents no evidence other than his own impressions (and the
reader's) for this conclusion. He feels that one cannot make 16
discriminations in color:

     "Color variation alone yields only about
     six selective degrees. Beyond that, one
     must use schemes involving monochrome
     figurations" (Bertin, Note 1, p. 325).

The second question is of great importance, for we have repeatedly
found that when a variety of graphical forms are compared (Wainer, 1974;
Wainer and Reiser, 1976; Wainer, Groves and Lono, 1978) the biggest effects
are (1) individual differences among the subjects, and (2) learning. Of
course, the displays being compared, though structurally quite different,
were competently and unambiguously constructed. In any case, the great
effect due to learning tells us that an unfamiliar display may initially
fare badly in comparison with a better known, but in some sense inferior,
one, but that after it is used for a while its superiority will manifest
itself (Wainer and Reiser, 1976; Biderman, 1976). For the forms we have
tested, we have found that performance usually seems to reach an asymptote
after only very few exposures. This means that innovative graphics are
possible if they require only a modest training period until viewers adapt;
the innovative form then becomes part of the individual's graphical
repertoire. However, if initial performance with a form is poor AND the form
cannot be learned, its institutionalization should be discouraged.

IV. EXPERIMENT 1

In this experiment, we asked subjects "What is happening at this place?"
and they answered. Two variable color maps were compared with two corresponding
"One Variable Color Maps". We recorded how long it took to answer
the questions, and how many mistakes were made. Of course, had we asked
instead "Where is (x,y) occurring?", where (x,y) is some bivariate event,
the two univariate maps would have been a serious failure. To complete
such a task, one would have to locate all areas which were, say, yellow on
one map, and then search for all areas which were, say, dark blue on the
other. Then the intersection of these two sets would have to be determined.
It is clear that merely looking for bright green on the two variable
maps would be far easier. Thus the superiority of the bivariate maps
to the two univariate ones in this task seems so obvious that no further
testing is required. However, if an unacceptably high error rate is observed
with a bivariate map in identifying "What is happening at this
place?", it is not likely to be diminished in the inverse task (which has
this identification as a subtask). Thus, if this error rate is unacceptably
large in the task we did test, an alternative display ought to be
developed. To jump ahead for a moment, our results indicate that the
performance we expect with these bivariate maps on almost any task is a lot
like that expected of a dog walking on its hind legs.
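
The contrast between the two ways of answering "Where is (x,y) occurring?"
can be made concrete with a small sketch. The area names and degrees below
are hypothetical and serve only to illustrate the argument: with two
univariate maps the reader must build two sets and intersect them, whereas
on a bivariate map each area already carries the pair as one color.

```python
from typing import Dict, Set, Tuple

# Hypothetical degrees (0 = lowest .. 3 = highest) for five map areas.
VARIABLE_A: Dict[str, int] = {"H5": 0, "H6": 3, "G5": 0, "G6": 1, "F7": 0}
VARIABLE_B: Dict[str, int] = {"H5": 3, "H6": 3, "G5": 3, "G6": 0, "F7": 2}


def where_univariate(a_degree: int, b_degree: int) -> Set[str]:
    """Two-map search: collect matching areas on each map, then intersect."""
    on_map_a = {area for area, d in VARIABLE_A.items() if d == a_degree}
    on_map_b = {area for area, d in VARIABLE_B.items() if d == b_degree}
    return on_map_a & on_map_b


def where_bivariate(a_degree: int, b_degree: int) -> Set[str]:
    """One-map search: every area shows the pair as a single legend cell."""
    combined: Dict[str, Tuple[int, int]] = {
        area: (VARIABLE_A[area], VARIABLE_B[area]) for area in VARIABLE_A
    }
    target = (a_degree, b_degree)
    return {area for area, pair in combined.items() if pair == target}
```

Both functions return the same areas; the difference lies in the work the
map reader must do, which is a visual search in one case and a search
followed by a mental set intersection in the other.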

SUBJECTS 

Sixteen  graduate  students  in  psychology  at  Johns  Hopkins  University 
participated  as  subjects  in  sessions  lasting  about  45  minutes. 

DESIGN 

A mixed design with one within factor and one between factor was used.
The within factor was map type (bivariate vs. univariate): all subjects
received both types of maps. The between factor was the
order in which the subjects received the two types of maps. Subjects were
randomly assigned to one of two groups; subjects 1 through 8 received the
trials with the bivariate maps first, and subjects 9 through 16 received the
trials with the univariate maps first.

STIMULI 

Examples of the map types were shown earlier (Figures 7, 8 and 9).
These were labelled every half inch, numerically along the y-axis and
alphabetically along the x-axis, yielding a familiar coordinate system for
maps. In addition, a monochrome rendering of a section of the colored
map was prepared with the specific area in question shaded. Both the
letter-number coordinate of the approximate area and the monochrome rendering
were presented simultaneously, to indicate to the subject exactly what
was being asked about. A sample "question" is shown in Figure 10.

Figure 10 [sample question; visible grid labels: H and 5 to 8]

All maps were of the same geographic area, but nine different variables
were displayed. In addition to the nine distinct univariate maps,
there were nine combinations of the variables to yield nine different
bivariate maps.


PROCEDURE 

Before the actual experiment, subjects were read instructions explaining
the purpose of the project and were introduced to the two map types.
The two map types were divided into two separate booklets containing nine
bivariate maps and eighteen (nine pairs) univariate maps, respectively.
One pass through an entire booklet is termed a "trial". Each trial was
composed of nine "tests". An experimental test consisted of finding the
degree of each of the two variables on a specific map type for one particular
location. This yielded 18 responses per trial for both map types.
Even though a test with the univariate maps required subjects to use two
maps, the booklets were arranged so that the same number of page turns was
required to pass through the booklets of both map types. Two different
variables, identified only by a letter name such as "Variable A" or
"Variable B", were used on each test. The univariate booklet contained
exactly the same variable pairs as were used with the bivariate map booklet.
The color key indicating the degrees of the variables was located on the map
sheets directly to the left of the maps. Each different variable pair actually
represented different variables, but the respective color keys for both the
bivariate and the univariate maps remained the same on each test.

On each trial, subjects were presented with a new cue booklet and an
answer sheet containing eighteen places for the subject to write in the
degrees of each of the two variables for each of the nine tests. Subjects
were instructed to flip a page of the cue booklet each time a page of the
map booklet was changed.

The location of the specific area to be reported was randomly determined.
All subjects received the same location cues. Since some locations
were easier to find because of peculiar shape or location, the same
locations and cues were used for both map types. Each subject received four
trials of each map type. All four trials with one map type were completed
before the four trials with the other map type were begun.

Two dependent variables, response time and number of errors, were recorded.
Response time was recorded with a stop watch, from the time the
subject began the first test until the last response was written.

Three practice tests with each map type were presented to subjects before
the actual trials. These practice tests enabled the experimenter to
determine whether the subjects understood the task instructions. The
instructions emphasized both speed and accuracy. Because subjects often
commented even before the actual experiment that the color discriminations
would be more difficult for the bivariate maps, subjects were instructed to
make their best guess if confused, and to continue working.

RESULTS 

The  response  times  for  all  subjects  on  all  trials  are  shown  in  Table  1; 
the  number  of  errors  per  trial  are  shown  in  Table  2. 


Table 1
Experiment 1 - Response Times (seconds)

                                  MAP TYPE

             Bivariate             Univariate              Bivariate
               Trial                  Trial                  Trial
Subject    1    2    3    4       1    2    3    4       1    2    3    4

      1  157  126  128  126     164  117  117  129
      2  153  132  128  127     138  130  124  139
      3  279  199  167  148     185  135  207  129
      4  162  156  132  163     154  156  131  121
      5  190  179  179  158     193  195  188  197
      6  138  142  108   90     151  142  113  103
      7  278  227  203    X     203  217  178  279
      8  154  163  167  110     168  156  186  180
      9                         155  115  161  120     145  132  133   98
     10                         132  118  102  103     108  108  102   83
     11                         198  201  172  180     140    X  127  118
     12                         308  218  163  161     207  149  134  127
     13                         200  134  159  144     161  189  184  180
     14                         172  146  152  103       X  123  131  107
     15                         290  237  199  190     220  170  158  165
     16                         191  191  144  152     178    X  185  158

Column groups run left to right in order of presentation: subjects 1
through 8 received the bivariate maps first, subjects 9 through 16 the
univariate maps first. X = missing observation.


Table 2
Experiment 1 - Errors per Trial (out of a possible 18)

                                  MAP TYPE

             Bivariate             Univariate              Bivariate
               Trial                  Trial                  Trial
Subject    1    2    3    4       1    2    3    4       1    2    3    4

      1    1    1    0    0       0    0    0    0
      2    1    1    2    3       0    0    0    1
      3    0    0    2    0       0    1    0    0
      4    0    3    0    0       0    0    0    0
      5    1    2    2    2       0    0    0    0
      6    5    3    2   11       3    0    0    0
      7    0    2    1    1       0    1    0    0
      8    1    2    4    1       0    3    0    8
      9                           0    0    0    0       0    1    3    0
     10                           0    0    0    0       0    3    2    1
     11                           0    0    0    0       1    2    4    5
     12                           0    0    0    0       3    1    0    1
     13                           2    1    0    2       1    0    0    2
     14                           0    0    0    0       0    2    2    0
     15                           3    0    1    0       4    1    1    2
     16                           2    3    1    3       1    X    0    0

Column groups as in Table 1. X = missing observation.

There are missing response times for four trials; in three cases this was
because the timer failed to operate properly, although error counts were
obtained. In one case (Subject 16, Trial 2) the cue booklet and the map
booklet became misaligned and neither response times nor error counts were
obtained.

A careful examination of the data matrices shown in Tables 1 and 2
indicates quite clearly what has happened. We shall point out the highlights
of the analysis without going into too much detail where the analysis was
standard, and give more information where something that was done
deviated from standard practice. We did separate analyses of the response
time and the errors. The dependent variables analyzed were the SQUARE
ROOT OF TIME and ARCSIN[(ERRORS/18)^.5], which are reasonably traditional
variance equalizing transformations for data of these types. In Table 3 are
shown the F values for the main effects; there were a couple of significant
interactions in the time variable (AxD and BxD), which reflect that how
quickly one learns depends on order, and also that some people learn faster
than others. However, these interactions are minor and do not enter into
our main argument. The principal finding is clear: the biggest differences
in response time are individual differences and learning (means for Trials
1 through 4 are 13.4, 12.6, 12.3 and 11.8 respectively), and the biggest
differences in error rates are between map types (bivariate = .24,
univariate = .08). Bivariate maps are responded to a bit more quickly, but
result in much larger error rates. The significant effects due to order
(which map type was presented first) are probably related to the number of
maps looked at, in that someone who views univariate maps first has seen twice
as many maps when he begins work on bivariate maps as have been seen by
bivariate-first subjects when they begin work on univariate maps. Also,
this order affords the subject the opportunity to practice a more familiar
task before going to the less familiar bivariate maps.
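
The two variance equalizing transformations can be written out as a short
sketch. The function names are ours and purely illustrative; the constant 18
is the number of responses per trial given in the procedure, and the arcsine
is taken in radians, which is an assumption but one consistent with the
reported error means.

```python
import math

POSSIBLE_ERRORS = 18  # responses per trial, from the procedure


def transform_time(seconds: float) -> float:
    """Square root of a response time measured in seconds."""
    return math.sqrt(seconds)


def transform_errors(errors: int, possible: int = POSSIBLE_ERRORS) -> float:
    """Arcsine (radians) of the square root of the error proportion."""
    return math.asin(math.sqrt(errors / possible))
```

On these scales a trial taking 157 seconds becomes about 12.5, and a single
error in 18 responses becomes about .24, so the reported bivariate error
mean of .24 is roughly the level of one error per trial.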


Table 3
F Ratios from an Analysis of Variance of Experiment 1

Factor          df    Sin^-1 sqrt(Errors/18)    sqrt(Time)

A. Order         1              .9                 15.3
B. Subjects     14             2.9                 25.6
C. Map Type      1            34.9                  7.8
D. Trials        3              .7                 30.6


V. EXPERIMENT 2

The second experiment is virtually identical to the first in procedure.
Our aim was to discover to what extent the legend had been internalized by
the subjects, since its internalization is clearly necessary for an easy
reading of the map at higher levels. Several weeks after the first experiment,
eight of the same subjects were re-recruited and assigned to one of
two experimental groups as before. The experimental procedure was repeated
exactly, except that a fifth trial for each map type was added. In this
trial, the legend was removed, and the subjects had to answer the questions
with the map and their memory of the meaning of the color key. It should
be remembered that by the time the subjects got to this fifth trial they
had seen 36 bivariate maps in the first experiment and 36 more in this one.
Thus for the bivariate case, they had made 72 previous judgments as well
as 6 practice tests. In the univariate case, they had seen 72 blue maps
and 72 red ones. It was felt that this was a sufficient amount of practice
to assure that most learning that was going to occur had occurred. Subjects
4, 6, 7 and 15 had bivariate maps first; 2, 5, 10 and 13, univariate.

RESULTS 

Response times and error counts for each subject and all ten trials
are shown in Tables 4 and 5. The message is clear. There is no significant
change in either error rate or response time for the univariate maps
when the legend is absent. For the bivariate case, response times increase
and error rates go through the roof.

Table 5
Experiment 2 - Errors per Trial (out of a possible 18)
Fifth Trial was Without Legend

                                  MAP TYPE

             Bivariate             Univariate              Bivariate
               Trial                  Trial                  Trial
Subject   1   2   3   4   5      1   2   3   4   5      1   2   3   4   5

      4   0   2   1   2   8      0   0   0   0   0
      6   0   3   3   4  12      1   0   1   0   1
      7   0   1   1   1   1      0   0   0   0   0
     15   2   0   0   2  11      1   0   1   0   1
      2                          0   2   0   0   0      0   1   1   0   6
      5                          0   1   1   0   0      0   2   1   0   5
     10                          0   3   0   1   0      4   2   2   1   8
     13                          0   1   1   0   0      0   2   1   0   5

Column groups run left to right in order of presentation: subjects 4, 6, 7
and 15 received the bivariate maps first; subjects 2, 5, 10 and 13 the
univariate maps first.


Shown in Tables 6 and 7 are some of the results of the analysis.
Table 6 indicates clearly that the only difference in error rates is
between the two map types, with the univariate maps being vastly superior.
Once again there are very large individual differences among the subjects
in response time, and there are learning effects. There is also an AxC
interaction in the time variable which reflects the notion that it is a
better training procedure to go from the simple to the complex, rather
than vice versa. This is the only significant interaction.

Table 6
F Ratios from an Analysis of Variance of Experiment 2

Factor          df    Sin^-1 sqrt(Errors/18)    sqrt(Time)

A. Order                        .3                   .4
B. Subjects                     .8                 28.7
C. Map Type                   18.4                  1.9
D. Trials                       .9                  3.2

In Table 7 are the means of the transformed variables for Trials 4
and 5. The increase in response time for Trial 5 is significant for the
bivariate maps. It is not for univariate maps. There is, as suspected
from Table 5, a huge increase in error rates from Trial 4 to 5 for bivariate
maps, none for univariate. An analysis of residuals from the time
analysis shows that there is a definite negative association between
residuals and number of errors after accounting for Factors A, B, C, and
D and interactions (if we ignore those who made no errors). The conclusion
we can draw is that for those who perform the task less than perfectly,
there is a speed accuracy tradeoff. This pattern is clear in Experiment 2,
less so in Experiment 1. These results are shown in Table 8.

Table 7
Performance Degradation with Removal of the Legend

                  Time Means                 Error Means
            Bivariate   Univariate     Bivariate   Univariate

Trial 4        11.0        11.4           .24         .06
Trial 5        12.7        12.0           .67         .06

Table 8
Analysis of Residuals from Time Analysis

                    Experiment 1               Experiment 2
No. of Errors   No. Cases   Residual       No. Cases   Residual

      0             68         .10             37        -.09
      1             24        -.52             22         .65
      2             17         .50              9         .53
      3             11        -.65              3         .47
      4              3        -.37              2        -.08
      5              2        -.54              1       -2.09
      6              0                          1       -2.72
      7              0                          1       -2.52
      8              1        3.73              2       -2.32
      9              0                          0
     10              0                          0
     11              1        3.06              1       -2.30
     12              0                          1       -2.55
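
The speed accuracy tradeoff in Experiment 2 can be summarized by pooling the
residuals over ranges of error counts. The sketch below uses the Experiment 2
case counts and mean residuals as printed in Table 8; the choice of 1 to 4
versus 5 to 12 errors as the two ranges is our own assumption, made only for
illustration, and trials with no errors are left out as in the text.

```python
from typing import Dict, Tuple

# errors -> (number of cases, mean residual), Experiment 2, as printed in
# Table 8; error counts with no cases are omitted.
EXPERIMENT_2: Dict[int, Tuple[int, float]] = {
    0: (37, -0.09), 1: (22, 0.65), 2: (9, 0.53), 3: (3, 0.47), 4: (2, -0.08),
    5: (1, -2.09), 6: (1, -2.72), 7: (1, -2.52), 8: (2, -2.32),
    11: (1, -2.30), 12: (1, -2.55),
}


def pooled_residual(table: Dict[int, Tuple[int, float]],
                    lowest: int, highest: int) -> float:
    """Case-weighted mean residual over error counts in [lowest, highest]."""
    rows = [(n, r) for errors, (n, r) in table.items()
            if lowest <= errors <= highest]
    cases = sum(n for n, _ in rows)
    return sum(n * r for n, r in rows) / cases


few_errors = pooled_residual(EXPERIMENT_2, 1, 4)    # slower than predicted
many_errors = pooled_residual(EXPERIMENT_2, 5, 12)  # faster than predicted
```

With these figures the pooled residual is about +.56 for trials with one to
four errors and about -2.40 for trials with five or more, that is, the
trials with the most errors were completed faster than the model predicts.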

VI. CONCLUSIONS AND DISCUSSION

The results from these two modest experiments point clearly toward
one conclusion: the search for an effective display technique for bivariate
data on a geographic background must continue. Experiment 1 indicated
that reading the two variable color map at the Elementary level is not
possible with acceptable accuracy. Thus, the only purpose for such a map
would be for its virtues at the Intermediate or Superior level. However,
in order for reading at these levels to be possible, the legend must be
internalized. Experiment 2 showed that merely memorizing this legend is
very difficult, if not impossible. Memorization of the legend is a necessary,
though not sufficient, condition for its internalization. To truly
internalize the legend and so be able to "see" its structure more-or-less
spontaneously is more difficult still. Consequently, reading this type of
display at these levels is at the very least difficult, and may be impossible.

Note that not being able to read a display at an Elementary level
does not preclude the possibility of its being read at some higher level.
For example, consider a bivariate scatter plot with axes labeled only at
their ends "high" and "low". The determination of any particular point
cannot be made accurately, but one can quite accurately estimate the degree
of relationship between the two displayed components (Intermediate level),
and compare this structure to that of other bivariate plots (Superior
level).

The learning of the legend does not mean that the legend is internalized
sufficiently for visual comprehension of forms. This is easily demonstrated.
Consider a bivariate scatter plot. Suppose we divide it up into
small square partitions and, instead of showing a dense array of dots within
each square, we represent the number of dots by their number (in dense areas
we plot a "7" or an "8" or a "9", in more sparse areas a "1" or a "2"). Of
course, we must take care to shrink such symbols as "8" so that the amount
of ink on the paper is the same as for a "1", and so the perceptual clues
from the somewhat graphical character of the Arabic numeral system are
removed. There are few symbols that are so well learned as the numbers, yet
we contend that such a scatter plot would not offer a very compelling
picture of the bivariate structure of the data. This may be one of the reasons
that the statisticians in Relles and Rogers' (1977) experiment were only
"fairly robust".

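The numeral scatter plot described above is easy to try. The sketch below is
a minimal illustration under our own assumptions: the data are hypothetical
(150 points with a positive linear relationship, generated from a fixed
seed), the plot is a 10 by 10 grid over the unit square, and counts above 9
are shown as "9". Plain text output cannot equalize the ink of the digits,
so that refinement is not reproduced.

```python
import random
from typing import List, Sequence, Tuple


def bin_counts(points: Sequence[Tuple[float, float]], bins: int) -> List[List[int]]:
    """Count points in a bins x bins grid over the unit square."""
    grid = [[0] * bins for _ in range(bins)]
    for x, y in points:
        col = min(int(x * bins), bins - 1)
        row = min(int(y * bins), bins - 1)
        grid[row][col] += 1
    return grid


def digit_plot(grid: List[List[int]]) -> str:
    """Render counts as characters: blank for 0, digits 1-9, with 9 as a
    ceiling. The first text line is the highest y bin, as on a plot."""
    lines = []
    for row in reversed(grid):
        lines.append("".join(" " if n == 0 else str(min(n, 9)) for n in row))
    return "\n".join(lines)


# Hypothetical data: y follows x with Gaussian noise, clamped to the square.
rng = random.Random(0)
points = []
for _ in range(150):
    x = rng.random()
    y = min(max(x + rng.gauss(0, 0.12), 0.0), 0.999)
    points.append((x, y))

print(digit_plot(bin_counts(points, 10)))
```

Reading the printed grid cell by cell recovers every count exactly, yet the
diagonal band is far less immediate than in the ordinary dot scatter plot of
the same points.
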
Much of our argument about the inutility of two variable color maps
rests on the notion that the legend scheme for these maps cannot be easily
learned, and further that even if it could be learned, a spontaneous
perception of the structure through such a learned legend is difficult, if
not impossible. There is some evidence that, when a coding scheme is very
well learned, the sort of instantaneous perception useful for the Intermediate
and Superior levels of reading can take place. Chase and Simon (1973),
in their study of chess players, showed that when a chess board configuration
was shown for a very brief amount of time, chess experts could almost
perfectly reconstruct the men on the board, so long as the men occurred in
positions which could logically have been arrived at during the course of a
game. Novices could only reconstruct a few positions in this case, and
neither the masters nor the novices were very successful at reconstructing
the board when the men occurred randomly on the board.

This chess example is one of a very limited set of experimental findings
we could locate which support the possibility of learning a symbolic
legend sufficiently well to allow a spontaneous perception of structure.
It must be emphasized that the subjects who manifested such perceptions were
beyond the level of casual player, and had typically devoted untold hours
to the study of chess boards.

It may be that, given a similar amount of assiduous study, even these
two variable color maps can yield useful perceptions. However, if they are
meant as communicative devices among individuals who are merely acquainted
with the display technique and not wedded to it, our earlier conclusion
about the necessity for another display scheme still holds.

Our experiments did not test whether or not someone could see aggregations
of same-colored sections. It would seem that this is another possible
use for these maps (e.g., "All these areas go together, they're yellow").
This is a very narrow use, for it can depend crucially on the precise
boundaries used to determine the coloring scheme. It would seem
that a general recognition of what is identical can be done to some extent,
but what groups are similar cannot (Is light blue more like yellow
than dark blue?). Thus two variable color maps may have some utility for
higher level reading in spotting groupings of identical areas, but this
is so limited that we don't believe it is a feature that should save these
maps from being discarded.

Returning now to the question of effective bivariate display of two
quantitative components on a geographic background, let us consider some
alternatives which are currently under study.

(a) Remove "yellow" as the representative of the lowest degree and
    substitute the more logically consistent "white". Then
    cross the two colored components.

(b) Use one of the monochrome figurations described earlier for the
    representation of one component, and varying saturations
    of color for the second.

(c) Omit the use of color entirely, and use two of the previously
    described monochrome figurations crossed.

Of course, how well these alternative schemes will work, and under
what circumstances, can only be determined empirically. Eliminating one
contender may not make the picture clear, but it does make it clearer.

We have not discussed in too much detail the individual differences
found in our experiments. These differences, which we shall term
GRAPHICACY, are seen most clearly in response time measures and less in
accuracy. This is not the case when the subject sample is less homogeneous.
Thus one important step in the development of a graphical format for a set
of information is careful consideration of the anticipated audience. W. H.
Kruskal (Note 2) dislikes the use of color because he possesses a form of
color-vision deficiency which makes the requisite discriminations for
certain colorations impossible for him. Biderman (Note 3) replied that color
vision defects have low prevalence and the prospective displayer should not
sink to the lowest common denominator of the prospective audience. Further,
he was "positively attracted to anything that Bill Kruskal could do LESS
WELL than the average bloke."

2 It may be that the harder we have to work to extract information, the more
we believe it. Prose can often communicate information more quickly than
a graphic, which may be itself more efficient than a table. Yet we tend
to believe our own findings from someone else's plot more than his words,
even though one often can be falsified as easily as the other. Following
the same logic, perhaps 14 pt. type is less believable than 7 pt.?

The epistemological basis of this formulation was clearly stated by the
Harvard mathematician C.S. Peirce (in Gardner, 1978). He felt that all
things could be ordered into monads, dyads and triads. Gardner states,

"Firstness considers a thing all by itself, for example redness . . .
Secondness considers one thing in relation to another, for example a
red apple . . . Thirdness concerns two things 'mediated' by a third,
for example an apple falling from a tree. The tree and the apple
are linked by the relation 'falling from' . . . Peirce applied firstness,
secondness and thirdness to every branch of philosophy. There
is no need, he argued, to go on to fourthness or fifthness and so on,
because in almost every case these higher relations can be reduced
to combinations of firstness, secondness and thirdness. On the other
hand, genuine thirdness can no more be reduced to secondness than can
genuine secondness to firstness." (p. 23)

SYMBOLS FOR DISPLAY OF MULTIVARIATE DATA:
THE FACE

Robert J. K. Jacob
Naval Research Laboratory

I. INTRODUCTION

People are well known to be proficient at processing visual information
(Entwisle and Huggins, 1973). They can do sophisticated processing
tasks, almost below the level of consciousness, when the data are presented
graphically (Arnheim, 1969). Until the advent of computer graphics, however,
people were not nearly as good at generating graphical (or iconic)
information as they were at assimilating it. Hence most data were actually
communicated using symbols, that is, in the symbolic mode rather than the
iconic mode. Now, the problem has become how best to use the iconic mode
for communicating information (Huggins and Entwisle, 1974). While there
are some traditional iconic techniques, such as maps and Cartesian graphs,
given the new capabilities, it becomes worthwhile to look for other, perhaps
better and richer, ways to use the iconic mode for communicating information.

A novel iconic device for communicating multidimensional numerical
information was proposed by Herman Chernoff (1971, 1973). This was the
cartoon human face. People look at and process faces constantly. They have
become well adapted to this task and are extremely good at performing it.
Hence people would be expected to perform visual processing on faces better
than on otherwise comparable visual stimuli. In fact, some evidence suggests
that the perception of faces is a special visual process (Yin, 1970).

To  represent  multidimensional  numerical  data  facially,  variation  in 
each  of  the  coordinates  of  the  data  is  represented  by  variation  in 
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one  characteristic  of  some  feature  of  the  cartoon  face.   For  example,  the 
first  component  of  the  data  might  be  represented  by  the  length  of  the  nose. 
Other  components  would  be  represented  by  others  of  the  18  possible  parame- 
ters, such  as  the  curvature  of  the  mouth,  separation  of  the  eyes,  width  of 
the  nose,  and  so  on.   Then,  the  overall  value  of  one  multidimensional  datum 
would  be  represented  by  a  single  face.   Its  overall  expression  —  the  ob- 
server's own  synthesis  of  the  various  individual  features  —  would  consti- 
tute a  single  image  depicting  the  overall  position  of  the  point  in  its 
multidimensional  space.   The  variety  of  possible  facial  expressions  would 
represent  the  variation  possible  in  a  set  of  numerical  data.   By  looking  at 
the  faces  and  applying  one's  innate  visual  processing  abilities  to  them,  an 
observer  could  perform  the  visual  equivalents  of  such  tasks  as  multivariate 
clustering  or  identifying  outliers  as  easily  as  he  notices  family  resem- 
blances between  people,  and  by  precisely  the  same,  almost  unconscious, 
mental  mechanism. 
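
The mapping described above can be sketched in a few lines of Python. This is
only an illustration: the parameter names, the [0, 1] parameter range, the
min-max scaling, and the neutral value for unused features are assumptions
made here, not details taken from Chernoff's or the author's program.

```python
# Hypothetical names for a few face parameters; the text mentions 18 in all.
FACE_PARAMETERS = ["nose_length", "mouth_curvature", "eye_separation", "nose_width"]


def datum_to_face(datum, lows, highs, neutral=0.5):
    """Map each coordinate of one datum to one face parameter in [0, 1].

    Parameters with no data coordinate assigned stay at a neutral value.
    """
    if len(datum) > len(FACE_PARAMETERS):
        raise ValueError("more data dimensions than face parameters")
    face = {name: neutral for name in FACE_PARAMETERS}
    for name, value, lo, hi in zip(FACE_PARAMETERS, datum, lows, highs):
        # Min-max scale the coordinate, then clamp it to the drawable range.
        scaled = (value - lo) / (hi - lo) if hi > lo else neutral
        face[name] = min(1.0, max(0.0, scaled))
    return face


# A three-dimensional datum drives three features; the fourth stays neutral.
face = datum_to_face([2.0, -1.0, 0.5], lows=[0, -2, 0], highs=[4, 2, 1])
```

Each datum yields one dictionary of drawing parameters, so a set of data points
becomes a set of faces that can be compared side by side.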

Figure  1  shows  how  the  faces  are  used  to  represent  data.   Here,  each 
face  represents  the  value  of  an  eight-dimensional  datum  chosen  from  an  un- 
correlated  multivariate  normal  distribution.   One  datum  differs  significantly 
from  the  remaining  19  on  several  dimensions.   It  is  rather  clearly  and  rapid- 
ly identifiable  (by  a  facial  expression  which  differs  from  the  remaining  19), 
despite the presence of considerable noise from the normal distribution.
(It is Face 4.)

Several changes were made to Chernoff's original faces for these studies.
The nose was changed from a line to a triangle, and its width became an
additional variable. Chernoff's face height and width parameters were
replaced by size and aspect ratio, which better match perceived dimensions.
Some discontinuities in the effects of changes in face outline parameters
were remedied by providing a set of ratio parameters for the outline.
Finally, it was found that reducing the range of variation on most
parameters gave a more realistic set of faces; these were preferred because
people are especially attuned to very small variations in realistic faces.
(See Jacob, 1976a, 1976b, for the computer program used to generate the
faces.)

FIGURE 1. Example of a Facial Display of an Intensive Care Unit

II.       COMPARING  FACES    TO   OTHER   DISPLAYS 
The   first    set   of    experiments    (Jacob,    Egeth,    and   Bevan,    1976)    was    in- 
tended   to   ascertain  whether    subjects   could   perform   common  or   useful   tasks 
better   with  data  displayed   as   faces   or   as   traditional    iconic   or    symbolic 
displays.      In   each  of    the   two    experiments,    subjects   performed   a    simple   task 
involving  a   set   of    synthetic  data.      Performance  was  compared   between   sub- 
jects who  were  given  the  facial   representation  for   the  data  and   those  who 
were  given  other   representations    (Chernoff   and   Rizvi,    1975;    Mezzich  and 
Worthington,    1978) . 

Experiment   1 

The  task  in  the  first    experiment   was  paired-associate  learning,    a 
simple,    standard   task.      It   consists  of   asking   subjects   to   learn  to  associate 
a  name  with  each  data  point.      Twelve   such  points  were  represented   by  digits, 
"glyphs"    (Anderson,    1960),    polygons    (Siegel,    Goldwyn,    and    Friedman,    1971), 
inverted   faces,    and    faces.      Each  of    these  displays    is    illustrated    in   Fig- 
ure  2.      The   entire  procedure  was   repeated    for   three  different   dimensionali- 
ties.     A  total  of    120  subjects  were  used. 

Results revealed a variety of effects, some mutually confounding. There
was a clear dimensionality effect, as expected; subjects performed better
on points in higher-dimensional spaces, since they contained a greater
amount of memorizable information (Egeth, 1966). Because the digit displays
lent themselves to rote rehearsal, they induced rather good performance.
The three-dimensional polygons gave particularly poor performance, probably
because they are all perspective transforms of each other. The glyphs were
difficult to organize perceptually and therefore to code, and they resulted
in generally poor performance. The overall result, however, was that faces
were at least as good as any of the other displays, and often better.

FIGURE 2. Example Stimuli from Experiment 1

The  most    interesting   observation  was   that    the   inverted   faces   tended    to 
be  difficult    to    learn   to   label,    especially   when   they  were  of    low  dimension- 
ality.     Inverted   faces  provide  a  good   control   for   perceptual   factors   such 
as   complexity,    symmetry,    and    integrality   of    elements,    but   they   lack  the   fa- 
miliarity of    faces.      Hence  this   finding    suggests   that    familiarity  may   be  an 
important   factor    in  explaining   the   superiority  of   the  face  display.      That 
is,    the   face  does   not   appear    to   be   simply   one   of   a   number   of   possible  geo- 
metrically well-designed  displays,    but,    rather,    it   has  unique  properties. 

Experiment   2 

While   the  paired-associate  learning   task  was  a   standard   psychological 
research  task,    it  was  not   the   sort  of   task  to   which  the   faces  were   intended 
to  be  applied    in  practice.      The   second    experiment    investigated   a  realistic 
and  practically-useful   task.      This  was  clustering,    or   sorting   into   catego- 
ries,   or   pattern  recognition. 

The   task  consisted   of   a    set   of    50   points    in  a   nine-dimensional    space, 
which  were  to   be  organized    into   five  groups.      They  were  generated    in  five 
clusters,    each  normally  distributed   around   a   center   point,    called    the 
prototype.      The   subject's   task  was   to   look  at    the  five  prototypes   and    then 
assign  each  of    the   50  deviants   to  a  cluster   surrounding  one  of   the  five 
prototypes.      The  correct   answers  were   those  which  put   deviants  with  the 
prototypes   from  which   they  were  generated,    and    to   which   they  were  closest 
in  Euclidean  distance.    (Conventional   cluster   analysis   procedures  were  also 
applied   to   the  deviants,    and    they   successfully  grouped    together   those 
deviants that were associated with each prototype.) While this was a
contrived task in that questions were derived from the answers, it was
outwardly similar to many realistic tasks. In a real task, the subject would
have  the  five  prototypes  in  his  mind,  abstracted  from  his  experience  or 
training.   He  would  look  at  a  new  data  point  and  assign  it  to  one  of  the 
groups  he  knew.   For  example,  a  doctor  would  examine  the  data  on  a  patient 
and  then  assign  him  to  a  cluster  that  represents  a  particular  disease. 

As  in  the  first  experiment,  the  55  data  points  were  represented  in  sev- 
eral different  ways,  and  subjects  performed  the  same  task  with  the  differ- 
ent displays:   faces,  a  second  set  of  faces  with  the  range  of  possible  var- 
iation reduced  to  three-fourths  that  of  the  first,  polygons  as  in  the  first 
experiment,  and  digits.   Figures  3  through  6  present  the  prototypes  (top 
row  of  each  figure)  and  examples  of  their  deviants  (succeeding  rows)  for 
the  four  different  display  types  respectively.   Polygons  were  used  here  be- 
cause they  had  been  found  to  be  the  better  of  the  two  alternate  graphic 
displays  used  in  the  first  experiment,  probably  because  their  elements  are 
better  integrated  (Garner,  1974). 

Results  consisted  of  the  number  of  errors  subjects  made  in  classifying 
the  50  points.   Table  I  shows  the  mean  number  of  errors  (chance  performance 
would  give  40  errors)  they  made  and  the  mean  time  (in  seconds  per  card)  they 
took  in  sorting  the  50  cards.   The  two  types  of  faces  were  found  clearly  to 
be superior to both the polygons and the digits at p < 0.001. Significant
differences were not found between the two face types or between the
polygons and digits. While the polygons could be sorted as quickly as the
faces,  they  were  not  sorted  correctly. 
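
The structure of the sorting task, and the chance level of 40 errors, can be
illustrated with a short simulation. This is a sketch under stated
assumptions: the prototype coordinates, the spread of the deviants, and the
random seed are invented here, and the nearest-prototype rule stands in for
the "correct answer" criterion described in the text.

```python
import math
import random


def nearest_prototype(point, prototypes):
    """Return the index of the prototype closest in Euclidean distance."""
    return min(range(len(prototypes)), key=lambda k: math.dist(point, prototypes[k]))


def make_task(n_clusters=5, n_dims=9, per_cluster=10, spread=0.1, seed=0):
    """Generate prototypes and normally distributed deviants around each one."""
    rng = random.Random(seed)
    prototypes = [[rng.random() for _ in range(n_dims)] for _ in range(n_clusters)]
    deviants = []
    for k, proto in enumerate(prototypes):
        for _ in range(per_cluster):
            # Each deviant remembers the cluster it was generated from.
            deviants.append((k, [rng.gauss(c, spread) for c in proto]))
    return prototypes, deviants


def count_errors(prototypes, deviants, classify):
    """Count deviants that a classifier assigns to the wrong prototype."""
    return sum(1 for true_k, point in deviants if classify(point, prototypes) != true_k)


prototypes, deviants = make_task()
errors = count_errors(prototypes, deviants, nearest_prototype)

# A sorter guessing at random among five piles is wrong four times in five.
chance_errors = len(deviants) * (len(prototypes) - 1) / len(prototypes)
```

With 50 deviants and five prototypes, `chance_errors` is 40, the figure quoted
with Table I. A subject's error count is read against that baseline.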

FIGURE 3. Example Stimuli from Experiment 2--Faces

FIGURE 4. Example Stimuli from Experiment 2--3/4 Range Faces

FIGURE 6. Example Stimuli from Experiment 2--Digits

TABLE I. Results of Experiment 2 (24 Subjects)

                         Faces    Faces (3/4 range)    Polygons    Digits
Mean no. wrong           14.46         17.29             28.21      31.38
  Standard dev.           5.50          5.95              5.27       7.50
Mean time (sec./card)     4.97          4.88              4.43       9.89
  Standard dev.           1.96          1.75              1.72       3.86

The conclusion drawn was that subjects performed a realistic and useful
task significantly better when the data were represented by faces than when
they were represented by a conventional display (digits) or by a
well-integrated graphic display (polygons). As the experimental task is a
fairly general one, one underlying many specific data analysis tasks such as
diagnosis, pattern recognition, and cluster analysis, it is claimed that
faces provide a superior display for many multivariate applications.
Subjects' comments on the experiment help explain this result. They reported
that they put all the "happy" faces in one pile, "angry" ones in another,
and so on; they found this easy to do. In fact, because of the
representation, they were performing a fairly sophisticated multivariate
clustering task accurately using only their visual processing abilities.
For the other displays, they reported inventing more complicated strategies,
which turned out to be self-defeating.

This  synthesis  by  the  observer  himself  of  the  various  graphical  ele- 
ments of  the  facial  display  into  a  single  gestalt  is  one  of  the  principal 
advantages  of  this  type  of  iconic  display.   Many  other  common  types  of 
displays  contain  several  variable  elements  and  could  thus  be  used  for 
graphing  multivariate  data;  but  often  such  displays  predispose  toward  a 
piecemeal,  sequential  mode  of  processing,  which  obscures  the  recognition 
of  relationships  among  elements.   By  contrast,  faces  induce  their  observer 
to integrate the display elements into a meaningful whole. Previous
research with simple cartoon faces and with photographs of real faces has
indicated that observers do indeed process these stimuli in such a holistic
fashion (Yin, 1969; Smith and Nielsen, 1970; Reed, 1972).

III.       USING    THE  FACE   DISPLAYS 

Having  supported  the  initial  supposition  that  the  faces  provide  a  dem- 
onstrably good  display  for  Euclidean  data  of  several  dimensions,  the  prob- 
lem of  displaying  a  specific  type  of  actual  (rather  than  synthetic)  data 
using  faces  was  addressed.   The  data  selected  for  this  purpose  were  the  re- 
sults of  a  psychological  test  intended  to  determine  a  patient's  psycholog- 
ical personality  profile.   It  was  thought  that  such  a  profile  might  possess 
a  more  natural  facial  representation  than  most  other  sorts  of  data. 

The  form  the  data  took  was  the  results  of  five  particular  scales  of  the 
Minnesota  Multiphasic  Personality  Inventory  (MMPI)  (Hathaway  and  McKinley, 
1942).   The  U.S.  Public  Health  Service  Hospital  in  Baltimore  administers 
this  test  to  patients  as  part  of  a  comprehensive  health  testing  and  evalu- 
ation and  was  interested  in  alternate  ways  to  display  the  test  results. 
The hospital uses five of the clinical scales of the MMPI: Hypochondriasis,
Depression,  Paranoia,  Schizophrenia,  and  Hypomania. 

Following the approach both of Chernoff and of the previous two
experiments, the five components of an MMPI data point could simply have been
assigned arbitrarily to five of the facial features (while the unused
features were kept at constant values). The resulting facial expressions and
the  personality  traits  or  disorders  which  each  represents  would  then  be 
learned  by  doctors,  just  as  they  have  learned  the  meanings  of  the  numerical 
data  and  the  graphs  presently  in  the  patient  reports.   However,  it  has  been 
widely  observed  (e.g.,  Secord,  Dukes,  and  Bevan,  1954;  Harrison,  1964)  that 
particular  facial  expressions  tend  to  signify  particular  personality  traits 
to  observers  with  great  consistency.   Therefore,  if  the  face  displays  could 
be devised in such a way that the expression on the cartoon face suggested
the   same   personality   traits   as   those   in   a   particular   MMPI   report,    the  re- 
sulting   face  displays   would    tend    to   communicate   the  meaning   of    the  data 
they   represent    intuitively.      To    this    extent,    a    self-explanatory   display 
would    have   been  constructed,    somewhat    like   an   hypothetical   graph   in  which 
it    is   not   necessary    to    label   the  axes,    because   the  meaning   of    the  curve 
is    inherently  obvious. 

Consider,    for    example,    a   particularly  unfortunate  arbitrary  assignment 
of   MMPI    scales   to   facial    features,    in  which  a    smile   on   the  face    signified 
a   patient    suffering   from   severe  depression.      While   this   could   certainly   be 
learned, just as the letters "depression" or the shape of the personality
profile  are   learned,    such   training   would   clearly   be   a   poor   utilization   of 
the  observer's   skills. 

Therefore   experiments   were  undertaken   to   attempt    to   obtain  a   positive 
relationship   between   the   five  components   of    the  MMPI    score  vector   and    the 
18   variable   parameters   of    the   face  construction.       It   was   hoped    that    the  re- 
sulting  face  displays   would    be   highly    intuitive  and    suggestive;    unlike  most 
computer   output    formats  which  require  the   human  observer    to    learn   to  under- 
stand   the  computer's   language,    the   power   of    the  computer   would   here   be 
used   to   tailor   the  display   format    to    suit   and    exploit    the  person's    intui- 
tion and   preconceptions. 

Experiment   3 

The   first    experiment    in   this   study  attempted    to  measure  a   relationship 
between  MMPI    scores   and    face  parameters   based   on  one  observer's   preconcep- 
tions  or    stereotypes.      This   corresponded    to   a   transformation   between  the 
five-space   of   MMPI    scores   and    the   18-space  of    face   parameters.      Because  of 
the imprecision in the process of perception of personality from faces, it
was hoped that a linear model would be sufficiently accurate for useful
results. Subsequent analysis of the experimental results for higher-order
interactions showed this to be a reasonable choice. Moreover, the
dimensionality of the problem made any other model very much more difficult
to study. Thus a matrix (T) was proposed to define a linear transformation
from the space of MMPI score vectors (d for diagnosis) into that of face
parameters (p).

A   set   of    200   faces  was   generated   using   parameter    (p)    vectors   chosen 
from  an   18-variate  uniform  random  distribution.      Figure  7    shows  a    sample  of 
these   faces.      Dr.    Faith  Gilroy,    a   research  psychologist   at    the  Public 
Health  Service  Hospital,    then   rated    each  of    the   faces   on   the   five   scales. 
She  was,    in   effect,    indicating   what   MMPI   results    each  of    the   faces   signi- 
fied  to   her,    or,    more   specifically,    what   MMPI    score   she  thought   a   person 
who    looked    like   each  of    the   200   faces   would    receive. 

A  multiple   linear   regression  of    the  p  vectors   on  the  d's  was   computed 
from   200  pairs   of    such  vectors,    producing   a   T  matrix  of   regression   coef- 
ficients   (Jacob,    1976a).      That   matrix   could    then   be  used    to   estimate  a   p 
vector    (or    face  drawing)    for   any   given  d   vector    (or   personality   score). 
Such   estimated   p  vectors   were  computed   and   compared    to   the  original    (stim- 
ulus)   p  vectors;    the  mean   squared    error   over   all   components   of   all   the 
vectors  was   0.07497    (components   of    the  p  vectors   ranged   between   zero   and 
one)  . 
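
The regression step can be sketched with NumPy as follows. This is not the
author's program: the ratings are replaced by synthetic random numbers, the
0-to-4 rating range is assumed, and fitting an intercept column is a choice
made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_scales, n_params = 200, 5, 18

# Synthetic stand-ins: P holds the stimulus face parameters in [0, 1] and
# D holds made-up ratings; the real ratings are not reproduced here.
P = rng.uniform(0.0, 1.0, size=(n_faces, n_params))
D = rng.uniform(0.0, 4.0, size=(n_faces, n_scales))

# Append a constant column so that the fit includes an intercept (assumed).
D1 = np.hstack([D, np.ones((n_faces, 1))])

# Least-squares solution of D1 @ T = P; T has shape (n_scales + 1, n_params).
T, *_ = np.linalg.lstsq(D1, P, rcond=None)

# Compare estimated face parameters with the original stimulus parameters.
P_hat = D1 @ T
mse = float(np.mean((P_hat - P) ** 2))


def face_for_scores(d):
    """Estimate the 18 face parameters for one vector of five scale scores."""
    return np.append(np.asarray(d, dtype=float), 1.0) @ T
```

Once `T` is fitted, `face_for_scores` turns any score vector into a set of
drawing parameters, which is the use to which the T matrix is put in the rest
of the study.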

The T matrix was displayed graphically by computing the p vectors that
correspond to equally-spaced points along the axes of the d space (that
is, points that represent patients who have only one psychological
disorder). Figure 8 shows the resulting display. In it, each row depicts a
series of patients with increasing amounts of a single disorder. Because
of the rating scale used, zero (the first column) represents an inverse
amount of the disorder, one represents no disorder (the origin of the d
space), and four represents a large, extrapolated amount of the disorder.
It was thought that these faces (particularly those in the column labelled
three) actually corresponded to common stereotypes of the personality
traits they were claimed to represent. The subject had never seen these
faces nor any resembling them; rather, they had been deduced from the
linear regression using faces reported to have more than one disorder.

FIGURE 7. Example Stimuli from Experiment 3

FIGURE 8. Facial Representation of the T Matrix

Some  comparisons  were  made  between  this  T  matrix  and  results  obtained 
by  previous  investigators.   While  no  studies  had  used  stimuli  of  this  com- 
plexity or  the  same  rating  scales,  some  of  the  observed  relations  between 
basic facial feature variations and basic emotions were confirmed.
Comparison to the work of McKelvie (1973) and of Harrison (1964) corroborated
both the major axes of facial variation found in the T matrix and their
relationships to variation in emotional states. (As one might suspect, these
all suggest that joint variation in the mouth and eyebrows is the major
determinant of the emotional content of the facial expression; that variation
induces variations along axes comparable to Paranoia, Depression, and
Hypochondriasis.)

An attempt was made to determine the important factors or axes
incorporated in the transformation T by performing a canonical correlation
analysis on the 200 pairs of p and d vectors (Morrison, 1967; Tatsuoka, 1971).
This  procedure  finds  pairs  of  axes  in  the  p  and  d  spaces  in  such  a  way  that 
the  correlation  between  each  such  pair  is  maximized  and  there  is  no  correl- 
ation between  any  two  pairs.   Thus,  all  of  the  correlation  between  the  two 
sets of data is contained in the correlations between pairs of
corresponding axes; and the axes in each space are mutually orthogonal. The
first three of the axes so discovered were found to be statistically
significant (p < 0.001); they gave correlations of 0.747, 0.652, and 0.455.
These axes are plotted facially in Figure 9, in the same manner as those in
Figure 8, except that the origin is labelled zero here (center column), and
the other columns show points lying 0.5 and one standard deviation away in
either direction.

The  first  axis  appears  to  be  related  to  a  "happy-sad"  dimension;  its 
counterpart  in  the  d  space  had  large  and  opposite  loadings  on  the  Depres- 
sion and  Hypomania  scales.   The  second  axis  had  a  large  loading  on  Para- 
noia in  one  direction  and  on  Hypochondriasis  and  Hypomania  in  the  other, 
suggesting  that  eye  angle  is  communicative  of  a  range  of  expressions  from 
anger  or  intensity  to  a  vacant  or  helpless  look.   The  third  axis  is  less 
suggestive;  it  seems  to  take  up  much  of  the  remaining  variability  in  the 
facial  features.   The  first  two  axes  correspond  closely  to  the  results  of 
the  previous  investigators  cited,  and  they  are  also  intuitively  plausible. 
These  axes  could  be  used  to  suggest  a  new  set  of  coordinates  for  face  par- 
ameters that  match  the  perceptual  space;  but,  strictly,  they  are  only  per- 
tinent to  the  representation  of  the  MMPI  data  under  study. 
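
For readers unfamiliar with the procedure, canonical correlations can be
computed as sketched below. This is one standard numerical route (QR
decomposition followed by a singular value decomposition), not necessarily
the one used in the study; the data are synthetic, and the function returns
only the correlations, not the axes themselves.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two sets of paired observations.

    Rows of X and Y are paired observations; columns are variables.
    Both matrices are assumed to have full column rank.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    # Orthonormal bases for the column spaces of the centered data.
    Qx, _ = np.linalg.qr(Xc)
    Qy, _ = np.linalg.qr(Yc)
    # Singular values of Qx' Qy are the cosines of the principal angles,
    # which are the canonical correlations, in decreasing order.
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)


# Synthetic example: 200 pairs of 18-dimensional p and 5-dimensional d vectors.
rng = np.random.default_rng(1)
D = rng.normal(size=(200, 5))
P = D @ rng.normal(size=(5, 18)) + rng.normal(size=(200, 18))
rho = canonical_correlations(P, D)
```

With five d variables there are at most five canonical correlations; in the
study, the first three were the ones found to be significant.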

An  additional  computation  shows  that  the  angles  in  the  p  space  between 
the facial representations of the orthogonal axes of the d space ranged
from  70  to  112  degrees,  suggesting  that  the  orthogonal  d  axes  were  indeed 
perceived  as  being  related  to  orthogonal  variations  in  their  facial  rep- 
resentations. 
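
The angle computation can be illustrated as follows. This sketch assumes that
the facial representation of each d axis is the p-space direction produced by
a unit step along that axis (one row of a T matrix stored with one row per d
axis); the three short vectors in the example are invented for illustration.

```python
import math


def angle_degrees(u, v):
    """Angle between two vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


def pairwise_axis_angles(t_rows):
    """Angles between every pair of p-space directions in t_rows."""
    angles = {}
    for i in range(len(t_rows)):
        for j in range(i + 1, len(t_rows)):
            angles[(i, j)] = angle_degrees(t_rows[i], t_rows[j])
    return angles


# Toy directions: the first two are perpendicular, the third lies between them.
example = pairwise_axis_angles([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 0.0]])
```

Angles near 90 degrees indicate that distinct d axes map to nearly
perpendicular directions of facial variation.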

Thus,  Experiment  3  provided  a  linear  transformation  from  MMPI  scores 
to  faces  which  was  both  intuitively  appealing  and  internally  consistent. 
Further  study  was  undertaken  in  order  to  validate  and  then  apply  this  re- 
lationship. 

FIGURE 9. Facial Representation of the Canonical Axes in the p Space

Experiment 4

An  attempt  was  made  to  replicate  the  previous  experiment  with  the  same 
and  with  another  subject.   A  new  set  of  random  faces,  generated  similarly 
to  the  first  set,  was  presented  to  two  subjects  who  rated  them  as  in  the 
previous  experiment.   Actual  responses  were  compared  to  those  predicted 
using  the  T  matrix  of  Experiment  3. 

The  comparison  was  confounded  by  the  appearance  of  significant  response 
bias.   That  is,  subjects  gave  consistently  higher  or  consistently  lower 
ratings  to  the  faces  on  certain  scales.   It  could  be  determined  that,  in 
those cases where the response magnitudes matched the predictions
(approximately half of the data), the present results supported the previous ones
in  direction  as  well.   In  the  remaining  cases,  neither  support  nor  contra- 
diction could  be  asserted.   This  experiment  could  have  been  improved  by  em- 
bedding the  stimulus  faces  in  a  larger  group  which  would  have  induced  sub- 
jects to  attain  the  same  mental  set  (and  thus  the  same  response  bias)  as 
that  of  the  subject  during  Experiment  3.   Instead,  experience  from  this  ex- 
periment was  used  to  devise  a  new  experiment  which  would  provide  a  more 
powerful  test  of  the  transferability  of  the  T  matrix  relationship. 

Experiment  5 

For  the  relationship  T  to  be  valid  and  transferable  to  other  observers, 
it  must  appeal  to  intuitive  stereotypes  that  are  already  present  in  the 
minds of most observers. Such stereotypes need not possess any absolute
validity;  they  need  only  be  widely  and  uniformly  held  in  order  to  be  ex- 
ploitable in  devising  a  facial  display  for  MMPI  scores.   Thus,  this  experi- 
ment was  designed  to  test  the  applicability  of  the  stereotypes  already  dis- 
covered.  Untrained  subjects  in  the  experiment  were  asked  to  match  facial 
representations  of  random  hypothetical  MMPI  scores  to  alternate 
representations of the same data. Since the numerical MMPI scores were
not meaningful to the subjects (or to the intended final users of the
display), an independently-developed textual representation for MMPI scores
(Rome et al., 1962) was used in this study.

The   30   subjects  were   each  given    50   stimuli,    an   example  of   which  ap- 
pears   in   Figure   10.       In   each,    the   subjects   were  asked    to    indicate  which 
of   the   five   faces   given   best    corresponded    to    the  given   text   description. 
In   fact,    that   description   was   the   textual   representation  of   a   particular 
point    in   the  d    space.      One  of    the   five   faces   was   the   facial    (using    the  T 
matrix)    representation  of    the   same   point,    and    the  remaining    faces   were 
representations   of    other,    randomly-selected   points. 

The  principal   result   of    interest   was  whether   entirely  naive   subjects 
could    select    the   face   that   was   claimed    (by   the  results   of    Experiment    3)    to 
represent   the   same  MMPI  data  as   the  text   at    better   than  chance  performance. 
If    the  T  matrix   had   no   wider   validity   than   for   one   subject   at   one   time, 
the subjects would not be able to perform the present task; if, however, the matrix
relationship   corresponded    to   widely-held    stereotypes,    the    subjects  would 
use   such  to   perform  this    task  better    than  a   random  guessing   hypothesis 
would   predict.      Results   were  obtained   by  measuring   the   Euclidean  distance 
in   the   five-dimensional  d    space  between   the   expected   answer   and    the  answer 
a    subject    chose.      Such  a   distance  could   range   from   zero    (correct   choice) 
to 4.5 (the maximum diagonal dimension of the hypercube). Table II
presents these data. A matched t test on the data revealed that subjects
were able, with highly significant (p < 0.0005) accuracy, to choose those
faces   which  were  designed    to   communicate  the   same   information  as   the   text 
items. 
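
As a rough check on the arithmetic, the t statistic can be recomputed from
the summary figures reported in Table II. The sketch assumes that the
tabulated standard deviation is that of the 30 per-subject mean distances,
which the text does not state explicitly.

```python
import math

chance_mean = 1.571    # Table II: chance performance
observed_mean = 1.226  # Table II: mean observation
std_dev = 0.146        # Table II: standard deviation
n_subjects = 30

# Standard error of the mean, then the distance from chance in those units.
standard_error = std_dev / math.sqrt(n_subjects)
t_stat = (chance_mean - observed_mean) / standard_error
degrees_of_freedom = n_subjects - 1
```

The result is about 12.9 on 29 degrees of freedom, close to the reported
12.975; the small gap is presumably due to rounding in the tabulated values.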

FIGURE 10. Example Stimulus from Experiment 5

TABLE II. Results of Experiment 5 (30 Subjects)

Chance performance    1.571
Mean observation      1.226
Standard deviation    0.146
t(29)                12.975


One   concern  was   that   measuring    the   Euclidean  distance   between  the  cor- 
rect  answer   and    the  answer   a    subject    chose   tended    to    emphasize  unimportant 
differences    (between   two   answers   that   were   both  far   from   the   correct   one) 
while obscuring more important differences (between correct and
nearly-correct answers). To correct for this, the transformation EXP(-DISTANCE)
was  used  as  an  alternate  measure  for   scoring   the  experimental  results. 
Further,    the   scoring   was   repeated   using   the  reciprocal   of    the   rank-order 
distance, in order to eliminate the effect of the arbitrarily-varying
distances between the randomly-generated stimulus points.* However, all of
the  measures  used    in   scoring   gave   similar   results. 
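
The three scoring measures can be sketched as below. The details are
assumptions made for illustration: in particular, the rank-order measure is
taken here to give rank 1 to the correct face and rank 2 to the nearest wrong
face, and the example points are invented.

```python
import math


def score_choice(correct, choices, chosen_index):
    """Score one response by the three measures discussed in the text.

    `correct` is the d-space point described by the text item, `choices` are
    the d-space points behind the five faces, and `chosen_index` is the face
    the subject picked.
    """
    distances = [math.dist(correct, c) for c in choices]
    d = distances[chosen_index]
    # Rank 1 is the correct face (distance zero); ties share the lower rank.
    rank = 1 + sum(1 for other in distances if other < d)
    return {
        "distance": d,
        "exp_neg_distance": math.exp(-d),
        "reciprocal_rank": 1.0 / rank,
    }


correct = [1.0, 1.0, 1.0, 1.0, 1.0]
choices = [
    [1.0, 1.0, 1.0, 1.0, 1.0],  # the face generated from the correct point
    [1.0, 1.0, 1.0, 1.0, 2.0],
    [3.0, 1.0, 1.0, 1.0, 1.0],
    [1.0, 3.0, 3.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
near_miss = score_choice(correct, choices, chosen_index=1)
exact_hit = score_choice(correct, choices, chosen_index=0)
```

Under EXP(-DISTANCE) and reciprocal rank, higher scores are better and the
differences among distant wrong answers are compressed, which is the effect
the alternate measures were meant to achieve.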

The results of this experiment suggest that the faces plus the T
transformation obtained provide a data display that requires no training of
the observer. Without any prior information other than their innate facial
stereotypes, subjects were able correctly to perceive the data being
displayed.

*It was originally intended that the four wrong answers to each question
in this "multiple-choice test" would lie at the same four distances from
the correct answer for every stimulus. However, this meant that those
four answers would be distributed near the surface of a hypersphere
with its center at the correct answer. Even when the four distances
differed substantially, the five faces imparted a strong sense of this
geometric relationship, and, as a result, one could use this relationship
rather easily to select the correct answer in each group without
seeing the question or knowing the geometry explicitly. Skewing the
distribution of the directions in which the wrong answers lay changed
the configuration to an approximation of a hyper-cone with the correct
answer at its vertex, but a comparable difficulty arose. Finally,
randomly-varying distances were permitted, and the experimental results
were scored using both actual and rank-order distances.

Experiment    6 

Experiment    5,    then,    demonstrated    that    the   faces   could   be  used    to   com- 
municate  psychological   data    to   naive   subjects.      Experiment    2    showed   that    a 
particular   useful   task  could   be  performed   better   and  more   quickly  with 
facially-represented   data   than  with   several   other   representations.      To- 
gether,   the   experiments    suggest    that   the   face  might    be  a    superior  mode  of 
displaying   the  MMPI   data  under   consideration.      The  present    experiment   was 
intended    to   test   this   composite  hypothesis   by  having    subjects   perform  a 
meaningful   and   realistic   task  which  requires   apprehension  of   MMPI   data. 
Various    subjects  would   perform   the   same   task  using   the  facial   and    textual 
representations   of    the   same  MMPI   data,    and    their   performance  would   be  com- 
pared. 

A truly realistic task would be the diagnosis and treatment of a real
patient; the results would be measured by evaluating the patient's
well-being at the conclusion of the treatment.  Unfortunately, there would be
far too many confounding variables in such an experiment (as well as
practical problems).  Instead, a crude task, analogous to psychological triage,
was devised.  Subjects were asked to rate the overall emotional well-being
of a hypothetical patient, given his MMPI test scores presented in one of
two ways.  Their success would be measured by comparing their responses to
the responses of a clinical psychologist who studied the unprocessed
numerical MMPI scores.  Thus, to the extent that a naive subject's responses,
using the facial or textual representation, corresponded to this baseline,
it could be claimed that, through the use of that representation for the
data, he was able to perform the same task as the trained psychologist.

Thirty-two subjects were each given 50 stimuli, each of which
resembled either Figure 11 or Figure 12.  In each case, the subject was
asked to rate a random point in the d space (represented facially
or textually) for emotional well-being.

Results were obtained by measuring the correlation coefficient
between a subject's ratings and those of the psychologist.  A chance
hypothesis would have predicted zero correlation.  The mean correlation scores
over subjects are presented in Table III.  First, one can observe that
subjects' performance exceeded chance expectation significantly (p < 0.005)
for both faces and text.  Next, conventional and also paired-observation
t tests were made to find the difference between the two display types.
Both tests showed that subjects performed the task significantly (p < 0.005)
better when given the text than when given the faces.

Some   insight    into   this   unexpected    situation  may   be  gained   by   studying 
the   text   displays    in  more  detail.      It    appeared    that    the  more  disturbed   a 
patient   was,    the   longer   his   text   description  was.      Hence   subjects'    re- 
sponses   to   the   text   could    have  been   based   on   this    inadvertent    iconic   con- 
tent  of    the   text   display;    they   could   have   been  responding   to    the   quantity 
of    text    rather    than   to    its  meaning.      To   test    this,    an  algorithm   that   rated 
the   emotional   well-being   of   a   patient    based   only  on   the   quantity  of    text 
in   the   textual   representation  of   his   MMPI    score  was   applied    to    the   exper- 
imental   stimuli.      As    shown    in   the   table,    the  algorithm  achieved    slightly 
better   performance   than   the   subjects   who   used   the   text   display.      Thus 
the   superior   performance  of    the   text   displays   could   be   explained   by   their 
unintentional    iconic   content;    or,    illiterate   subjects   could   have  produced 
the   same  responses   from   the   text   displays   as   did   college   students. 
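
The length-only check described above can be sketched as follows.  This is
a minimal illustration, not the algorithm actually applied in the experiment:
the function names, the character-count cutoffs, the sample descriptions, and
the baseline ratings are all hypothetical.  Only the three-point rating scale
(0 for normal, then 1 and 2) and the use of a correlation coefficient against
the psychologist's ratings come from the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def rate_by_length(text, cutoffs=(40, 100)):
    """Rate a description 0, 1, or 2 from its character count alone.

    The cutoffs are hypothetical; the words themselves are never examined.
    """
    return sum(len(text) > c for c in cutoffs)

# Hypothetical stimuli: longer descriptions stand for more disturbed patients.
descriptions = [
    "Within normal limits.",
    "Some anxiety and mild depression; occasional somatic complaints.",
    "Marked depression with social withdrawal, persistent anxiety, "
    "suspiciousness of others, and numerous somatic complaints.",
    "No unusual findings.",
]
baseline = [0, 1, 2, 0]  # hypothetical ratings by the psychologist

length_ratings = [rate_by_length(d) for d in descriptions]
score = pearson(length_ratings, baseline)
```

If a rater of this kind scores as well as human readers do, the text display
cannot be credited with communicating meaning, which is the point the
paragraph above makes.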

The conclusions of this experiment are, then, unclear.  While the
text displays induced better performance, this turned out to be explainable
by an irrelevant property they were found to possess.  Nevertheless, the
usefulness of faces for inducing good performance in processing Euclidean
data was established by Experiment 2; and the ability of the transformation
discovered in Experiment 3 to transmit data facially without training
was established by Experiment 5.  These continue to suggest that an
improved version of Experiment 6 would indicate superiority for the facial
representation.

FIGURE 11.  Example Face Stimulus from Experiment 6

FIGURE 12.  Example Text Stimulus from Experiment 6

TABLE  III.   Results  of  Experiment  6  —  32  Subjects 

Text  Faces 


Mean   correlation   score 

Standard   deviation 

Difference   from  chance  —    t(31) 

Difference   between  means   —   t(62) 
Paired   observations   difference   —   t(31) 

Score  using    text   algorithm  0.667 


The   Orthogonal   Subspace 

A set of additional stimuli was appended to the facial portion of
Experiment 6 in order to examine another idea.  (Since these followed the
regular stimuli, they did not affect subjects' responses to them.)  The
facial representation of the MMPI disorders consisted of a five-dimensional
subspace of the 18-dimensional space that represents all possible
values of face parameters.  There remains an orthogonal 13-dimensional
subspace of facial variation.  Under the linearity assumption of
Experiment 3, any variation in this subspace should have no effect on the
MMPI-related meaning of a facial expression.  In particular, if variation in
this 13-dimensional subspace were superimposed upon a face that lay at the
origin of the five-dimensional d space (i.e., one that was the facial
depiction of a normal MMPI score), all of the resulting faces should also
depict normal MMPI scores.
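
The geometry of this argument can be sketched numerically.  The example below
is only an illustration under stated assumptions: the five d-space axes are
random stand-ins (the actual axes came from the regression analysis of
Experiment 3 and are not reproduced here), Gram-Schmidt orthogonalization is a
technique chosen for the sketch, and all names are hypothetical.  It builds
the 13-dimensional orthogonal complement of a five-dimensional subspace of an
18-dimensional space and confirms that variation within the complement has no
component along any d-space axis.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(vectors, tol=1e-9):
    """Gram-Schmidt: return an orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that already lie in the span
            basis.append([wi / norm for wi in w])
    return basis

def orthogonal_complement(subspace, dim):
    """Orthonormal basis of everything in R^dim orthogonal to `subspace`."""
    inside = orthonormalize(subspace)
    standard = [[1.0 if i == j else 0.0 for j in range(dim)]
                for i in range(dim)]
    full = orthonormalize(inside + standard)
    return full[len(inside):]

DIM = 18                      # number of face parameters
rng = random.Random(0)        # fixed seed so the sketch is repeatable
# Hypothetical stand-ins for the five MMPI-disorder axes of the d space.
d_space = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(5)]
complement = orthogonal_complement(d_space, DIM)

# A face at the origin of the d space plus random variation in the complement.
weights = [rng.gauss(0.0, 1.0) for _ in complement]
varied_face = [sum(w * axis[i] for w, axis in zip(weights, complement))
               for i in range(DIM)]
projections = [dot(varied_face, axis) for axis in d_space]  # all near zero
```

Under the linearity assumption, a face such as `varied_face` differs from the
normal face only in ways that carry no MMPI-related meaning.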

Figure 13 shows the facial representation of five arbitrarily selected,
mutually orthogonal axes lying in this 13-dimensional orthogonal subspace;
they are plotted in the manner of Figures 8 and 9, with one row of faces
representing movement along one axis and the origin at the center of each
row.  Examination of faces in this subspace suggested that the partition of
the facial variation into two distinct orthogonal subspaces might be a
more stable one than the further partition of the d space into its five
MMPI-disorder axes, which was tested in the previous experiment.  (Both of
these partitions are a direct result of the regression analysis of
Experiment 3.)  It appeared that the two subspaces depicted an emotional
component and an identification component of facial expressions.  The latter
represents variations that help distinguish the faces of particular
individuals but transmit little emotional content.

To examine this notion, 10 additional faces representing normal MMPI
scores plus random variation in the orthogonal space were generated.  These
faces all projected onto the origin of the d space.  That is, while they
varied along five of the axes of the orthogonal space, they contained no
variation along any of the five MMPI disorder axes.  If the d space indeed
spanned most or all of the facial variation attributable to psychological
disturbance, these faces should have been consistently rated "emotionally
well," even though they contained considerable "non-psychological"
variation.  They can be viewed as representing psychologically normal people of
varying physical and ethnic characteristics.  In order not to affect the
subjects' mental calibration with respect to the possible range of
variation of the faces, five psychologically abnormal faces were intermixed
among these faces.

FIGURE 13.  Facial Representation of the Axes of the Orthogonal Subspace

The results for this added portion of Experiment 6 were
straightforward, in contrast to those above.  If the three points on the rating scale
are considered zero (normal), one, and two, the mean ratings for the two
sets of faces are 1.575 for faces in the d space and 0.584 for faces in
the orthogonal space.  Chance responses would have given a mean of one,
so the observed scores showed significantly (p < 0.0005) more psychological
disturbance than chance for the d space and significantly less for the
orthogonal space.  Thus, the variation in the purportedly
non-psychological, orthogonal subspace of facial variation was indeed perceived as
having relatively little psychological meaning.  This suggests that a
reasonable partition of facial variation into two types may have been discovered;
it could be used to construct displays that communicate in a single face
both psychological and non-psychological data, each in its own subspace,
with relatively little mutual interference.

IV.       CONCLUSIONS 
Two   principal   conclusions   are  drawn   from   this    study.      First,    computer- 
produced    faces   are  a   particularly   good   representation   for    inducing    superior 
performance   of   useful   tasks   on  multivariate  metrical   data.      Experiments 
with  other    iconic   and    symbolic   displays    indicate  that    it    is   the   facial   dis- 
play   itself,    not   merely   the    iconic  mode,    that   accounts   for    this    superior- 
ity.     Second,    the   stereotype  meaning   already   present    in   faces   can   be  util- 
ized   in   constructing   a   facial   display.      It   was   possible   to  measure  and    then 
exploit    such  meaning    in   order    to   create  a   demonstrably   self-explanatory 
display   for   a   particular    set    of   data. 




ACKNOWLEDGMENTS 

Prof.  William  Huggins  was  the  author's  advisor  while  the  author  was  a 
graduate  student  in  Electrical  Engineering  at  the  Johns  Hopkins  University; 
he  provided  guidance  and  insight  throughout  this  work.   Profs.  Howard  Egeth 
and  William  Bevan  at  Johns  Hopkins  guided  the  work  on  the  first  set  of  ex- 
periments.  Drs.  Richard  Hsieh  and  Faith  Gilroy  of  the  U.S.  Public  Health 
Service  Hospital  in  Baltimore  provided  important  assistance  for  the  second 
set  of  experiments. 

This  research  was  supported  by  a  contract  between  the  Johns  Hopkins 
University  and  the  Engineering  Psychology  Programs,  Office  of  Naval  Re- 
search; and  by  the  U.S.  Public  Health  Service  Hospital  in  Baltimore, 
Maryland. 








A BASIC TEST FOR GRAPHICS
Jacques Bertin

All disciplines make more or less use of graphics, and in every case
the same problem arises: how should the data be drawn?  I have tried to
answer this question by outlining the principles of graphics and of its
semiology.  And yet useless maps and diagrams are still very numerous.
Something simpler had to be found to avoid these errors.  With this in
mind, here is a test that is easy to apply and that is intended
particularly for those responsible for a research project or a publication,
and for any user or reader of diagrams and maps.

This test originates in two observations:

Every diagram, every map, is the transcription of a two-way data
table.


The aim of a graphic transcription is to understand, that is, to
reduce the multitude of elementary data to the groupings that the
data set as a whole constructs.


Consequently, a diagram or a map must provide a visual answer to the
following two questions:

1.  What are the x and y components of the data table?

2.  What groups of x elements and what groups of y elements do the data
construct?

These two questions constitute what may be called the basic test of
graphics.  It sheds light on "how to draw" through reflection on "why draw",
by introducing the notion of pertinent questions and of their hierarchy,
from elementary questions up to essential questions.  It shows that the
latter have only one graphic solution.

There are no good or bad diagrams, no good or bad maps.  There are
constructions that answer, and others that do not answer, the questions
one is entitled to ask.

By bringing out the hierarchy of possible questions, this test
emphasizes that one does not look at a graphic or a map the way one looks
at a painting or a road sign.  One does not "read" a graphic; one asks it
questions.  And one must know how to ask the useful questions.

This test defines the essential questions and makes it possible, in
front of any graphic construction, to pass an immediate and indisputable
judgment, and often to discover unbelievable errors.

This test makes it possible to avoid non-pertinent questions.  Moreover,
by underlining the two stages of graphic perception, the two questions of
the test show that graphic perception is not governed by communication
theory, and that tests of the kind "What do you see?  What do you prefer?"
are unrelated to the aim of graphics and become a source of confusion and
errors.

By its simplicity and by the developments it allows, this test
appears to be of an effectiveness so far unequaled, to the benefit of
graphics and no doubt also of logic and its languages.  What are its
applications, and first of all, how are the two preliminary observations
justified?

I.  TWO PRELIMINARY OBSERVATIONS
A.  Every "graphic" is the transcription of a data table.

A "datum" is the relation between two elements.  Take for example
the following datum: "Mr. M is 25 years old."  This assertion establishes a
relation between element M of a set of individuals and element 25 of a
set of ages.  A data set constructs the relations that exist between a set
of elements called "objects" and another set of elements called
"characteristics", attributed to these objects.  Any data set can therefore
be laid out in the form of a table that places the component called
"objects" in x and the component called "characteristics" in y.*  The cells
of the table thus formed record the relation observed between each of the
elements of x and each of the elements of y.  This notation is the z of
the image.**

How are data entered into the computer?  By means of a coding sheet,
that is, a two-way table!  Where does cartography begin?  With the work of
the surveyor, who draws up a "point book", that is, a two-way table!  Let
us note finally that any network of relations can also be laid out as a
two-way table.

If one grants that the x and y entries of the table are not limited
in number of elements, one can therefore easily imagine (though not
necessarily construct) any problem in this form.  Every graphic and every
map is thus indeed the transcription of a two-way table, however large.
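
The idea that any set of data can be laid out as a two-way table can be
sketched in a few lines.  This is an illustrative sketch only: the function
name and the sample data (apart from "M is 25 years old") are hypothetical.
Each datum is taken as an (object, characteristic, z) triple, objects are
placed in x (columns) and characteristics in y (rows), and a missing relation
is recorded as "?".

```python
def to_two_way_table(data):
    """Arrange (object, characteristic, z) triples as a two-way table.

    Returns the x entries (objects), the y entries (characteristics),
    and the rows of z values, one row per characteristic.
    """
    objects, characteristics = [], []
    cells = {}
    for obj, char, z in data:
        if obj not in objects:
            objects.append(obj)
        if char not in characteristics:
            characteristics.append(char)
        cells[(obj, char)] = z
    # "?" marks a cell for which no datum was supplied.
    rows = [[cells.get((obj, char), "?") for obj in objects]
            for char in characteristics]
    return objects, characteristics, rows

# "M is 25 years old" plus hypothetical companions.
data = [
    ("M", "age", 25),
    ("N", "age", 31),
    ("M", "height_cm", 180),
]
objects, characteristics, rows = to_two_way_table(data)
```

Here `rows` is `[[25, 31], [180, "?"]]`: one row per characteristic, one
column per object, with the unknown height of N left as "?".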


* Or the reverse.  But objects can generally be numbered, whereas
characteristics must be defined.  These definitions are more legible when
they are horizontal, that is, when the characteristics are in y.

** This analysis carries an absolute requirement: the relations can be
expressed only by yes or no (1 or 0), by an order (1st, 2nd, 3rd...),
by quantities, by "?" (absence of data), or by "not applicable".  Any
other notation is excluded.




B.  The purpose of every graphic is to reduce the entries of the data
table.

Data, that is, the observations one can make, are always a
multitude.  To decide is to choose, but at the moment of decision we cannot
retain and take into account this multitude with the necessary rigor.  We
must therefore reduce this multitude, that is, discover similar elements,
group them, and classify them.  That is the price of understanding and
deciding.  Recall the famous proposition: to understand is to categorize,
and more precisely, to reduce the totality of the data taken into account
to the groupings that the relations construct.

It therefore means bringing out, in a data table:

-  the groups of objects
-  the groups of characteristics

that the z relations construct.  The z relations are the numbers in the
table.  In graphic matrices, it is the variations from white to black that
correspond to these numbers.

Take the following example, deliberately as simple as possible.

In 1966, the five ministers of the EEC (European Economic Community,
commonly called the "Common Market") meet to discuss the problems of the
meat market.  They have numerous statistics at their disposal.  To make
them easier to use, the EEC administration has diagrams constructed.

Diagram (1) is the transcription of the data (2).  One finds that users
prefer the data table to it, because (1) allows only the elementary
information to be seen, and that is more legible in (2).  The normal
construction of a data table is the matrix (3).  It brings out the overall
information, that is, the groupings of countries (4, 5 and 6) and the
production structures (7 and 8) that characterize these groupings.

How long does it take to see what this is about, that is, to discover the
nature of the x and y components of the data table?  Here it is not
possible to ask a pertinent question, even an elementary one!

Let us look at diagram (1), constructed by this administration.  What
information does it allow us to "see"?  Very little indeed, and one finds
that those responsible prefer the table of figures (2) to it.  No doubt
unconsciously, they have perceived how little interest there is in the
answers supplied by the drawing.  One sees in it, for example, that 69% of
the sheep are in France, or again that France produces more than Belgium!
Was it necessary to make a drawing to learn that France is 16 times larger
than Belgium?

The normal construction (3), by contrast, brings out the real content
of the data table, and it is important: Germany and the Netherlands have
the same production structure, based on pork and beef.  These two countries
therefore form a group (4), opposed to another group, that of France and
Italy, characterized by an inverse structure (6).  Finally, the production
structure of the Belgo-Luxembourg Union (5) is different from the two
preceding ones.  It follows that, within the framework of these data, the
policies of the first two groups can only be opposed ... or complementary,
and that, if there is a vote and the votes are equal, the decision belongs
to the Belgo-Luxembourg Union.

It is by discovering that the 25 elementary data (5 x 5) reduced the
countries to three groups (4), (5) and (6), defined by two groups of
products (7) and (8), that the data table yielded the essential
information, from which every elementary datum takes its place either as
evidence of this information or as an exception.

And this result does not depend on the small size of the example.
Indeed, whether the table is 5 x 5, that is, 25 elementary data, 50 x 50,
or 1000 x 1000, that is, one million elementary data, the problem is always
the same.  The point is to understand, and it is known that a person
integrates at most about 7 combinatorial concepts around a single problem.
The point is therefore to bring any table, any body of information, down
to this accessible number of concepts.

The point is therefore indeed to reduce the table to the groupings or
orders, in x and in y, that the z relations construct.  This is the purpose
of statistical processing, and particularly of "multivariate" processing.
It is the purpose of every graphic transcription.
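
The reduction of a table to groupings can be sketched as a reordering of its
columns.  The sketch below rests on stated assumptions: the production
figures are invented, the countries are given neutral labels A to E, and the
greedy nearest-neighbor ordering is a simple stand-in chosen for the
illustration, not the procedure by which matrix (3) was obtained.  Each column
is first converted to a profile (each product's share of that country's
total), so that countries are compared by production structure and not by
size.

```python
import math

def profile(column):
    """Share of each product in a country's total production."""
    total = sum(column)
    return [value / total for value in column]

def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def reorder_columns(table, start):
    """Greedy ordering: always place next the column whose profile is
    closest to the profile of the column placed last."""
    profiles = {name: profile(col) for name, col in table.items()}
    order = [start]
    remaining = [name for name in table if name != start]
    while remaining:
        last = profiles[order[-1]]
        nearest = min(remaining,
                      key=lambda name: distance(profiles[name], last))
        order.append(nearest)
        remaining.remove(nearest)
    return order

# Hypothetical production of four products by five countries.
production = {
    "A": [120, 60, 10, 10],
    "B": [10, 15, 40, 35],
    "C": [55, 35, 5, 5],
    "D": [24, 26, 70, 80],
    "E": [5, 5, 5, 5],
}
order = reorder_columns(production, "A")
```

With these figures the order is A, C, E, D, B: countries with similar
structures end up adjacent (A with C, D with B), and E, whose structure
differs from both, falls between them.  Twenty elementary figures are thus
reduced to three groups of countries.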

II.  APPLICATION TO DIAGRAMS
The very simple example we have just seen sheds light on the basic
principles of graphic perception and construction.  The essential
information is indeed of the form defined by the second question: what are
the groupings in x?  Answer: (4), (5) and (6).  What are the groupings in
y?  Answer: (7) and (8).  Consequently:

Knowledge of the x and y is the first condition.

Now, faced with drawing (9) for example, can one discover what it is about?
How much time must the reader spend to define the x and y entries of the
table that was used to construct this drawing?  How can the reader, under
these conditions, see the information that these data contain?  Clearly,
any useful graphic "reading" begins with knowledge of the nature of the x
and y entries of the data table.

Why then, as here, destroy these entries graphically?  Why then, as
in too many cases, write these entries in letters less than a millimeter
high, or even forget them?  The written definition of these entries is the
first stage of graphic perception.  It is the true title of a diagram.

Writing the definition of the objects and of the characteristics very
visibly is the first rule of construction for diagrams.

Every useful graphic must answer this first question of the test.  Armed
with this question, the reader can pass a first judgment on any diagram,
and this judgment still too often risks being negative.

Knowing how to define the pertinent questions and rank them.*

Once the reader has become acquainted (or has been able to become
acquainted) with the entries of the data table, he is in a position to
define all of the questions pertinent to this table.

He finds first of all that there are two types of questions:

-  questions introduced by x: such a country, how much? and
-  questions introduced by y: such a product, how much?

This observation will take on its full meaning in cartography.

Il  constate  ensuite  qu'il  peut  envisager  dans  chaque  type  soit  des  él- 
éments, soit  des  sous-ensembles,  soit  des  ensembles,  et  qu'il  peut  par  con- 
sequent  poser   trois  niveaux  de  questions: 

-  des  questions   élémentaires:    tel   pays,    tel  produit,    combien? 

-  des   questions  moyennes:    tel   pays,    quels    sont    tous    ses   caractères? 

-  une   question  d'ensemble:      comment    se  regroupent    les   pays? 

This analysis by type and by level defines all the pertinent questions,
and one then sees that constructions such as (1) are useless because
they answer only elementary questions: a given sector of the circle,
that is, a given product, a given country, how much?

Indeed, these elementary questions are a multitude. And the purpose of
graphics, like that of mathematics, is not to represent this
unmemorizable multitude but, on the contrary, to fight against it and
to discover the overall relationships, which alone can be memorized.
The useful graphic must therefore provide a visual answer to the
overall question, that is, answer the second test question.


*  "Hypotheses" are simply a choice of pertinent questions. In
   principle, they precede the data table and thus make it possible to
   design it.

Knowing how to define the useful construction, that is, the normal
construction.

How can one answer the second test question and discover the groupings
in x and in y when the drawing breaks up the entries of the table, as
in (1) or in (9)? Clearly, only under two conditions, which define the
normal construction:

(1)  The normal construction preserves the matrix structure of the data
table. It places the x and y entries of the data table on the x and the
y of the sheet of paper, respectively, and transcribes the z (the
relations) by the value or the size of the marks. This is the second
rule for constructing diagrams.

(2)  The normal construction reorders the rows and/or the columns of
the matrix to make the groups appear. This is the third rule for
constructing diagrams.

Indeed, the rows and columns of (3) are not in the same order as in
table (2). It is by permuting the rows and bringing similar rows
together, and by permuting the columns, where appropriate, and bringing
similar columns together, that one discovers the groups, that is, the
overall information.
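
The permutation of rows and columns can be sketched in a few lines of
Python. This is only an illustrative sketch: the toy 0/1 table, the
labels, and the function names `barycenter` and `reorder` are
hypothetical, and the barycenter ordering used here is one simple
heuristic chosen for the example, not the manual or mechanical
procedure the text refers to.

```python
def barycenter(values):
    """Weighted mean position (0-based) of the non-zero cells in a line."""
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(i * v for i, v in enumerate(values)) / total


def reorder(matrix, row_labels, col_labels, passes=10):
    """Permute rows and columns so that similar lines end up adjacent.

    Assumption of this sketch: lines are sorted by their barycenter,
    alternating between rows and columns for a fixed number of passes.
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    for _ in range(passes):
        # Sort rows using the current column order, then columns
        # using the new row order.
        rows.sort(key=lambda r: barycenter([matrix[r][c] for c in cols]))
        cols.sort(key=lambda c: barycenter([matrix[r][c] for r in rows]))
    permuted = [[matrix[r][c] for c in cols] for r in rows]
    return permuted, [row_labels[r] for r in rows], [col_labels[c] for c in cols]


# Hypothetical toy table: rows are objects, columns are characteristics.
table = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
permuted, rows, cols = reorder(table, ["A", "B", "C", "D"], ["p", "q", "r", "s"])
for label, row in zip(rows, permuted):
    print(label, row)
```

On this toy table the rows come out in the order A, C, B, D and the
columns in the order p, r, q, s, so the two groups that were
interleaved in the original order appear as two solid blocks.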

These permutations are easy. A child performs them naturally. They
sometimes surprise adults, who have gradually lost the habit of
"SEEING" wholes during a schooling founded on the element, the word,
the figure, the sign, and in which the drawing teacher is above all an
aesthetician. But the problem of permutations is first of all a
practical one. One can proceed simply by cutting up the drawing. When
the data are numerous, there is specialized equipment that makes these
permutations easy to carry out.*

Taking advantage of the properties of the normal construction.

a  -  The normal construction answers all the pertinent questions. "The
good thing is that you see everything," say users. What exactly do they
mean? Quite simply that the normal construction answers both the
overall questions and the elementary questions, which makes every piece
of elementary information useful, because one sees instantly whether it
is representative of the general trend or, on the contrary, an
exception, and this guides reflection and research. One thus discovers
that graphics is the only "language" that allows one to go instantly
from the whole to the detail and from the detail to the whole, and to
judge any element. But this is true only of the normal construction.
Any other construction, for example the constructions (10), which
"represent" the same data as (2), does not allow one to reach the
whole. Such constructions answer only elementary questions, without
making it possible to pass judgment on these elements of information.
If graphics is held in low regard, it is mainly because of the
uselessness of these constructions.


*  Cf. (3) p. 35.

Like diagram (1), these four constructions "represent" the data of
table (2). None makes it possible to see how the x and the y of the
data table group together. These classical constructions are in fact
useless. And they are no more useful when they are distributed over a
map (cartograms), because the minimum of order that remains in these
diagrams is destroyed on the map.

b  -  The normal construction avoids non-pertinent questions. "If other
characteristics had been chosen, the groupings would be different."
This observation, unfortunately a common one, is probably correct. But
it concerns an entirely different problem, defined by the question
"which data table should be constructed?" This question is external to
the table. It is not pertinent to the table. Not confusing two distinct
moments of reflection, the choice of data and the processing of data,
is an essential rule of logic that the normal construction makes
particularly evident.

Moreover, such an observation could not have been made in front of
drawing (1). One thus discovers that it is the results of the
processing, that is, the groupings, that license these observations and
make it possible to direct the search for new characteristics.

Graphics can help answer the question: which data table should be
constructed? This is what is called the "matrix analysis of a problem,"
and it is an operation of a very different nature.*

c  -  The normal construction demystifies the computer.

-  I ran my survey through the computer!

-  But what questions did you "put to the computer"?

How many researchers can answer precisely and simply? Since every
survey, every study, is in reality only the resolution of one and only
one data table (which one must obviously know how to imagine), the two
test questions and the list of pertinent questions make it possible to
give a precise answer and to define clearly all the possible forms of
processing. Graphics thus gives a visible form to what is called "data
processing." It is always a matter of discovering the groupings in x
and in y constructed by the z relations of a defined table. Graphic
processing and "multivariate" mathematical processing thus complement
classical statistics, which computes a correlation coefficient or a law
of correspondence between two rows of a table.

Moving from spectator to actor

To take an interest in a problem and to understand it is to move from
elementary reading to overall reading. The purpose of graphics is to
make this passage possible. Graphics is not an art. Unlike "graphic
design," graphics is a language that is rigorously finite in its means
(it is the only one) and that operates on rigorously defined sets. It
is therefore not governed by "communication theory" or by psychology.


*  Cf. (3) p. 233.

Indeed, perceiving a poster, a road sign, or a word requires only a
single stage of perception: what is it about? Perceiving a graphic
requires, as we have just seen, two distinct stages of perception:

(1)  What components are involved?

(2)  What are the relations among these components?

The first stage is conventional. It is a matter of isolating, among the
unlimited number of imaginable concepts, one or two given concepts, for
example discovering that drawing (1) deals with five countries and five
types of meat. Faced with the infinity of possibilities, the verbal or
figurative conventions through which one must pass always offer various
possibilities of interpretation. The first stage is therefore governed
by the classical schema of polysemic communication:
Emitter ↔ Code ↔ Receiver.

The second stage is not conventional. We are no longer faced with an
unlimited number of concepts, but with only three concepts: the three
fundamental relations to which every observation can be reduced:

-  relations of resemblance or difference (≠)

-  relations of order (O)

-  relations of proportionality (Q)

And graphics is not conventional, since it transcribes a relation by
the same relation. It transcribes a resemblance between things by a
visual resemblance between signs or between positions (they are close
together). It transcribes an order between things by a visual order
between signs or between positions. It transcribes a proportion between
things by a visual proportion between signs or between positions.

In the second stage, our eye does not look at the meaning of a single
sign (which is always open to debate). It looks at what varies from one
sign to another. It uses only the visual variation between signs (which
is beyond debate). Consequently, transcribing an order by a resemblance
is not adopting a convention. It is constructing false groupings and
therefore telling a lie. Graphic transcription is thus not free. But
that is why it is universal.

Emitter and Receiver are linked by the data table. They are in exactly
the same situation. Emitter and Receiver are "actors" who ask the
second test question in the form: what, in the table, are the
proportions and the orders and, ultimately, what are the groups
(resemblances) constructed by the data?

In the second stage of perception, "author" and "reader" follow the
monosemic schema: actor ↔ three relations (Q, O, ≠).

This schema underlines that the classical tests of the kind "what do
you see on this diagram, on this map; which map, which color do you
prefer?" in fact assume that one looks at a graphic as one looks at a
painting. As applications of communication theory, they concern only
the first stage of graphic perception. They do not provide a precise
and concise means of defining beyond dispute the "why" and therefore
the "how" of a diagram or a map.

The schema of monosemic transcription is in a sense the canonical form
of the two test questions. Together with them, it forms an instrument
of analysis that makes it possible to detect and avoid the principal
errors.

Knowing how to avoid the principal errors

1st error - Not giving prominence to the x and y entries of the table.

The entries of the table are the only means of knowing what it is
about. They must therefore be apparent instantly. Written very visibly,
they must appear in their place on the entries of the graphic matrix.

2nd error - Destroying the entries of the table.

The normal construction preserves the x, y, z structure of the data
table. Any other construction destroys the entries and answers only
elementary questions or certain intermediate questions.

3rd error - Not making the groupings appear.

It is not enough to preserve the structure of the table. One must make
the similarities appear, that is, bring similar rows together and,
where appropriate, similar columns. It is these permutations that make
the groups in x and the groups in y appear.

4th error - Adopting a convention and transcribing an order by a
disorder.

In the x, y plane, this is precisely to commit errors 2 and 3 above. In
the third dimension z of the image, it is for example to transcribe a
progression by a disorder of values, which often happens with color.
The eye then perceives false visual groupings! This error is
particularly frequent in maps of a single characteristic. The maps
supplied on a cathode-ray screen to the President of the United States
of America have marvelous colors. But the levels of value do not follow
the quantitative levels! The President therefore sees false groupings,
false geographies!

5th error - Adopting a convention and transcribing an order by a
difference.

This is for example to transcribe an order by a variation in shape. The
groupings disappear. There are only four ordered visual variables: the
two dimensions of the plane, size, and value. A proportion (Q) can be
transcribed only by the plane and by size. An order (O) can be
transcribed only by the plane, size, and value. The other variables,
texture, color, orientation, and shape, are not ordered. They do not
construct groups. They only separate elementary pieces of information.

They can sometimes underline groupings, but only when the distribution
in the plane is extremely simple, in other words when the elements are
already grouped in the plane.
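
The rule stated under the 5th error can be written down as a small
lookup table. The sketch below is only an illustration of that rule:
the dictionary `VISUAL_VARIABLES` and the two helper functions are
hypothetical names introduced here, and the table encodes nothing
beyond what the preceding paragraphs state about proportion (Q) and
order (O).

```python
# Which relations each visual variable can transcribe, per the text:
# "Q" = proportion, "O" = order. Unordered variables get an empty set.
VISUAL_VARIABLES = {
    "plane (x, y)": {"Q", "O"},
    "size": {"Q", "O"},
    "value": {"O"},
    "texture": set(),
    "color": set(),
    "orientation": set(),
    "shape": set(),
}


def variables_for(relation):
    """List the visual variables able to transcribe the given relation."""
    return [name for name, rels in VISUAL_VARIABLES.items() if relation in rels]


def can_transcribe(relation, variable):
    """True if the variable preserves the relation, False if it destroys it."""
    return relation in VISUAL_VARIABLES[variable]


print(variables_for("Q"))            # variables that can carry a proportion
print(variables_for("O"))            # variables that can carry an order
print(can_transcribe("O", "shape"))  # the 5th error: order shown by shape
```

A check of this kind could be run before drawing: if the data to be
shown in z are ordered and the chosen variable is shape or color,
`can_transcribe` returns False and the groupings will not be visible.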

6th error - Multiplying partial diagrams.

This is the basic error. It destroys the overall view of the problem
being treated. A study is a whole. It is therefore a single table that
one must know how to imagine, and which alone can justify the subsets
treated separately.

7th error - Drawing solely for publication.

A graphic construction is not made to be published. It is first of all
a personal working instrument, which makes it possible to process the
information and to discover the groupings that the data contain.
Publication comes afterward. And one publishes only what is necessary
and sufficient.

III.  APPLICATION TO CARTOGRAPHY

Every map is the transcription of a double-entry table.

Map (12) is the transcription of table (11). In the map, the
departments are scattered over the plane. In the table, these same
departments are aligned in x and the characteristics in y.

Whatever the number of elements informed in the map, 90 departments or
90 million points, one can imagine them aligned in x in a table that
transcribes in y the characteristics observed. Every cartographic
problem can therefore be imagined at the outset as the transcription of
a table having the geographic points in x and the characteristics in y.

Consequently, the set of questions pertinent to a map corresponds to
the set of questions pertinent to the data table of which it is the
transcription. And what we have just seen for diagrams applies to every
cartographic problem. The two test questions make it possible to define
"how to make the map" by analyzing "why make the map," that is, by a
precise analysis of the pertinent questions and of their visual
answers.

1st test question: what are the x and the y of the data table?

The x is the definition of the space considered in the map.

Argentina, the USA, London, a neighborhood, a given house, a given
object. The geographic shape or the context may be enough to define the
space considered. But this is not necessarily true for everyone, and
when this information is missing the map loses all meaning. This is the
case, for example, of partial enlargements that are not identified on
an index map. Graphically or verbally, the space must be clearly
identifiable to anyone consulting the map.

The x is also the scale and the partition of the space, where
applicable. Is the information at the level of the department, the
commune, squares 1 km on a side, the parcel ...? These definitions must
not be forgotten.

The y is the definition of the characteristics.

In a map of a single characteristic, the latter is generally written
very visibly. Obviously, it is the title of the map. Why, then, in maps
of several characteristics, is this same title, for it is indeed the
same thing, generally written in microscopic letters beside the boxes
of the legend?

The legend is nothing other than the y entry of the data table. It is
the second part of the title. It is the indispensable means of entering
the map. The same goes for the legend boxes. How often are they so
small that one cannot tell which "green" or which "bistre" is meant?

The true title of the map, that is, the legend, must be given
visibility and therefore all the space it needs.

The economy of space, or of work, that certain layout artists or
graphic designers allow themselves, for whom the usefulness of the map
is the least of their concerns, is in reality very costly. Indeed,
being unable to recognize spontaneously the x and the y of the table,
that is, the space represented and the characteristics distributed, is
the best way of inviting the reader to turn the page.

2nd test question: What are the groupings in x and the groupings in y?

This question raises the problem specific to cartography: a map of
several characteristics cannot answer both the elementary questions and
the overall questions, unless it is highly simplified. Constructing a
map therefore means making, consciously or not, two choices:

-  the choice of a level of answer: elementary answers or overall
   answers;

-  the choice between a simplified map and a non-simplified
   (exhaustive) map.

The level of the answers

Consider map (12). What are the groupings in y, that is, which
characteristics have the same "geography"? No visual answer!

What are the groupings in x, that is, what are the homogeneous regions
that the data construct? No visual answer!

Map (12) is the cartographic transformation of table (11). This map
answers only the elementary question in x (13): at a given place, what
is there? The maps (14) answer only the overall question in y (15): a
given characteristic, where is it? The choice of a level of answer is
unavoidable.

The superposition map (12) answers the elementary questions, "at a
given place, what is there?", which corresponds to (13) in the data
table. But it does not answer the overall questions. This is easily
explained: can one superimpose many photographs on a single film and
see each one separately? Obviously, superimposing several images
destroys each individual image. The superposition map is a "map to be
read" point by point.

Consider now the four maps (14). What are the groupings in y? The
answer is instantaneous: maps II, III, and T resemble one another and
differ from I.

What are the groupings in x? Answer: the data construct two
geographies: an "agricultural" France (I) and an "urban" France (II,
III, T).

The collection of maps (14) answers the overall questions (16) by way
of the question "a given characteristic, where is it?" (15). Each map
is a "map to be seen" instantly, which makes it possible to discover
resemblances and differences. But obviously the collection provides no
instantaneous answer to elementary questions of type (13).

Mapping several characteristics always means choosing between two
levels of information:

Elementary level: at a given place, what is there? (13)

or

Overall level: a given characteristic, where is it? (15)

By asking these two questions, any reader of a map can immediately
judge the level of information perceptible. Likewise, the person
responsible can define the useful level of information and the
corresponding graphic formula:

The elementary level is supplied by the superposition map.
The overall level is supplied by the collection of maps of a single
characteristic.
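
In terms of the data table, the two levels correspond to reading one
row or reading one column. The following sketch illustrates this under
stated assumptions: the table `TABLE`, its place and characteristic
names, and the two functions are hypothetical and exist only to make
the two questions concrete.

```python
# Hypothetical toy data table: places in x, characteristics in y,
# 1 = present, 0 = absent.
TABLE = {
    "place A": {"c1": 1, "c2": 0, "c3": 1},
    "place B": {"c1": 0, "c2": 1, "c3": 1},
    "place C": {"c1": 1, "c2": 0, "c3": 0},
}


def what_is_here(table, place):
    """Elementary level (13): at a given place, what is there?"""
    return [c for c, present in table[place].items() if present]


def where_is(table, characteristic):
    """Overall level (15): a given characteristic, where is it?"""
    return [p for p, row in table.items() if row[characteristic]]


print(what_is_here(TABLE, "place A"))  # one row: the superposition map
print(where_is(TABLE, "c1"))           # one column: one map per characteristic
```

The superposition map makes the first function instantaneous for the
eye; the collection of single-characteristic maps does the same for the
second.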

The most common error is to ignore the two levels of information [...]
the overall information, which is the pertinent one.

Case 1. The overall questions are the most pertinent

In other words, the map must provide an instantaneous answer to the
question: a given characteristic, where is it? The following study
offers a typical example. A large and very advanced country (it will
surely recognize itself) undertook a remarkable ethnological study:
more than 2,000 survey points, more than 800 types of folk events
multiplied by 3 possible dates, which gives 2,400 characteristics x
2,000 points, or 4,800,000 yes/no answers. Unfortunately the test
questions were not asked. The usual solutions were copied and the
characteristics were superimposed on the map! But since superpositions
have their limits, the problem was divided. Twenty-five maps were
constructed, each with about 32 characteristics differentiated by shape
and multiplied by 3 date colors.
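
The arithmetic of this example can be checked directly. The short
sketch below uses the round figures quoted in the text (which are given
there as lower bounds, "more than"); the variable names are
illustrative only.

```python
# Round figures quoted in the text (stated there as "more than").
survey_points = 2000
event_types = 800
possible_dates = 3
maps_drawn = 25

# Size of the data table: characteristics in y, points in x.
characteristics = event_types * possible_dates
yes_no_answers = characteristics * survey_points

# Load carried by each superposition map.
types_per_map = event_types // maps_drawn
signs_per_map = types_per_map * possible_dates

print(characteristics)  # characteristics in the table
print(yes_no_answers)   # cells in the table
print(types_per_map)    # types of event per map
print(signs_per_map)    # distinct shape/color combinations per map
```

Each of the 25 maps therefore asks the eye to separate on the order of
a hundred distinct shape and color combinations, which gives a sense of
why the overall image of any single characteristic is lost.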

What is the result? Many researchers are invited to exploit the survey.
Are they asked "at a given place, what is there?" Obviously not! On the
other hand, they are asked whether relations exist between types of
events, between types and dates, whether certain groupings characterize
regions, and which groupings, and which regions. In other words, how do
the x and the y of the data table group together?

To answer correctly, the researchers are obliged to redraw the 800
maps, one per characteristic, without which any commentary would be
unfounded. It is now easy to calculate the cost of this error of
analysis!

What should have been done? Certainly not superimpose the
characteristics. The double-entry table should have been constructed
and the groups made to appear ... with the help of the computer. That
is what it is there for. It supplies the groups, and also the
subgroups, and their geography, and it can even answer the elementary
questions.

But what if one has no computer? Two types of solution:

(a)  accept a loss of information, proceed by sampling, and reduce
     800 characteristics to a smaller number, or 2,000 points to a
     smaller number. It can be observed and demonstrated that it is
     preferable to reduce the number of geographic points. An
     image-file brings out the groups and then makes it possible to
     return to the exhaustive information.

(b)  keep the exhaustive information and, from the outset, make one
     map per characteristic, noting the dates by a variation in size.
     Without the computer, this solution is less costly overall
     because it eliminates the infernal work of separating out each
     characteristic on the superposition maps, repeated 800 times! It
     is therefore more useful than the 25 maps, spectacular no doubt,
     but only for those who are not invited to exploit them correctly.

Legibility. Complex superpositions become "illegible." And yet the
authors of the 25 ethnographic maps were convinced of the contrary!
Why?

These authors took the best measures to make each sign "legible," that
is, to ensure that it is confused neither with another characteristic
nor with neighboring signs. The cartographer solved the problem of
legibility in x in the data table: he answers the question: at a given
place, what is there?

But he forgot legibility in y (a given characteristic, where is it?),
the only legibility capable of answering the second test question.
Classical cartography, daughter of topography, still often forgets that
there are two types of questions and therefore two problems of
legibility that cannot be solved at once.

Constructing a working instrument. When the 2nd test question must be
answered, and when the number of characteristics is large, one must not
begin with superposition maps. One must construct the working
instrument that will make the correlations and the groups appear.
Cartography then comes in to show the geographic distribution
constructed by the correlations.

The working instruments are the computer, graphic matrices, or the
collection of maps.

Cartography is not tied to publication. Cartography is first of all a
working instrument, and it is very important to understand that a good
share of the maps drawn are never published. The person responsible for
cartography must know how to distinguish between

-  laboratory documents, needed to discover what there is to say, and
   which are generally not published, and

-  publication documents, which are constructed for a given public or
   chosen from among the working documents as justification. This
   underlines, moreover, and unfortunately it still has to be said,
   that one does not "write" the text first and "illustrate" it
   afterward. Quite the contrary: diagrams and maps are, like
   mathematical processing, the starting points of the discourse, and
   the discourse is nothing other than the justification of the
   processing and the interpretation of the groupings that the
   researcher has retained.

Second case. The elementary questions are the most pertinent.

Only the elementary topographic relations are useful.

This is the case, for example, of the architect's plan.

The plan shows the mason the precise spot where he must build an "8"
partition. It shows the plumber the spot where he must install a "15"
tap ... The plan defines points and lines in relation to small identifiable
subsets. Users do not need to see the overall image of each characteristic.
On the other hand, they must find a precise and complete answer to the
question "at a given place, what is there?" All the useful characteristics
must therefore be superposed on the map.

These are reference maps. This is the domain, and the only one, of
conventional signs: one sign means "tap", another means "house", another
means "contour line" ... It suffices

(1) that the information be exhaustive, that is, that the map contain
all the elements needed by a well-defined user;

(2) that the elements not be confused with one another, either in position
or in meaning.

This is the problem of the visual separation of the elements. It is not
easy to solve, because it depends essentially on the complexity of the
distribution. The matrix view of the data makes it possible to analyze the
principal parameters of this complexity, which increases

-  with the number of characteristics (the y of the data table),

-  with the number of topographic elements (the x of the table) and
with their heterogeneity in size,

-  with the number of levels per characteristic (the z of the table),

-  from juxtaposition to superposition (sharpness of the separations in x).

The choice of characteristics and their visual separation is a delicate
problem; it is enough to gather, for example, a collection of tourist maps
to see that it is rarely solved. But it is not the only problem of
cartography. More important is to know whether this problem should be posed
at all, that is, whether superposition is necessary or not.

The first error is to believe that every map is a reference map, and to
superpose several characteristics on a single map when the overall
information is needed. The example of folklore maps shows the cost of such
errors. But the matter becomes serious when "decision makers" have only
superposition maps to guide their decisions. They see only a few elementary
items of information, and it is impossible for them to see whether these
elements are significant or, on the contrary, exceptions to the general
trend. Some official documents contain nothing but cartograms.* Yet a
cartogram excludes all overall information. One may therefore wonder, with
some concern, on what basis the decisions were made.

*Cartogram: superposition of n characteristics through the construction of
diagrams of type (1), (10), (12), (18), etc., scattered over the map.

But the second error is to forget to make reference maps when they are
necessary. How many books on history, geography, archaeology, and the
sciences fail to provide the map that is indispensable for following the
author's argument! One author even writes, in an account of a battle, that
he gives none because they bore the reader! Fortunately, a great many series
exist: topography, road maps, vegetation maps, geology, climate, morphology,
etc. They answer the elementary reading, and sometimes the overall reading
when the zonal characteristics simply divide up the plane among themselves
(geology, for example).

It can nevertheless be observed that the accumulation of information, and
above all the need to bring a very large number of characteristics into a
problem, call many cartographic series into question. There is a growing
tendency toward a cartography of regional intervention, capable of taking
all the desirable characteristics into account by using the computer and
the collection of maps.

Third case. Both levels of questions are pertinent.

This is the case, for example, of national or regional geographic atlases.
Those who consult them are numerous. Some ask "at a given place, what is
there?" Others ask "where is a given characteristic?"

Why neglect the latter? Why ignore those who need to compare the most
diverse characteristics, to discover the geography corresponding to their
problem, to their own data table? For them, maps of type (12) or (18), for
example, are perfectly useless.


When both levels of questions are pertinent, there is only one cartographic
solution: make several maps:

1 - The superposition map, to answer the question "at a given place, what
is there?"

2 - One map per characteristic, to answer the question "where is a given
characteristic?"

This solution is easy, because the maps per characteristic can be much
smaller than the superposition map. They do not need color, but they must
include an unobtrusive spatial reference system (a grid, for example) fine
enough to facilitate precise comparisons.
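
The two questions can be read as two ways of querying the same data table.
What follows is only a minimal sketch, not the author's method: the table
contents and the names TABLE, what_is_at, where_is, and
one_map_per_characteristic are assumptions introduced for illustration.

```python
# Hypothetical data table: x = places (rows), y = characteristics (columns).
# A value of 1 means the characteristic is present at that place.
TABLE = {
    "A": {"tap": 1, "house": 0, "contour": 1},
    "B": {"tap": 0, "house": 1, "contour": 1},
    "C": {"tap": 0, "house": 1, "contour": 0},
}


def what_is_at(table, place):
    """Question in x: at a given place, what is there? (superposition map)"""
    return sorted(c for c, v in table[place].items() if v)


def where_is(table, characteristic):
    """Question in y: where is a given characteristic? (one map per characteristic)"""
    return sorted(p for p, row in table.items() if row.get(characteristic))


def one_map_per_characteristic(table):
    """Build the 'collection of maps': one list of places per characteristic."""
    characteristics = sorted({c for row in table.values() for c in row})
    return {c: where_is(table, c) for c in characteristics}
```

In this sketch the superposition map corresponds to reading one row at a
time, while the collection of small maps corresponds to reading one column
at a time; both are derived from the same table.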
Simplified map or non-simplified (exhaustive) map

The schematic map that one often encounters, not only in schoolbooks but
also in informational documents, is useful, but it must be clearly
understood

1 - that it cannot replace the initial information. Map (20) does not allow
the exhaustive data (19) to be reconstituted. When those data may be useful
for other comparisons, the simplified map fails.

2 - that it is always debatable, either in the choice of the characteristics
brought out or in the choice of the level of simplification.

What is simplification?

The simplified map corresponds to the reduction of the data table. It
transcribes geographically the groupings in x (regions) defined by the
groupings in y (characteristics) that the data z construct. These groupings
can be discovered by means of the collection of maps or by means of matrix
manipulations.

Consider the classic map of age pyramids (18). It transcribes table (17).
Like any cartogram, it destroys the "geographies". To see them, one must
construct

-  either the collection (19), which leads to the simplified map (20),

-  or the matrix (22), which regroups easily (24) and yields the
simplified map (23).

But (20) and (23) do not allow the initial data to be reconstituted. Those
data are lost unless (18), (19), or (24) is provided.
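
The loss of information can be illustrated on a toy table. This is a
sketch under stated assumptions only: the figures are invented, and the
barycenter ordering and the two-category threshold are simple stand-ins
chosen for illustration, not the permutation procedure used to obtain
figures (22) to (24).

```python
# Hypothetical table: rows = regions (x), columns = age classes (y),
# cells = share of the population in percent (z). All values are invented.
AGE_CLASSES = ["young", "adult", "old"]
TABLE = {
    "R1": [50, 30, 20],
    "R2": [20, 30, 50],
    "R3": [45, 35, 20],
    "R4": [25, 25, 50],
}


def barycenter(row):
    """Weighted mean column index: low = young profile, high = old profile."""
    return sum(i * v for i, v in enumerate(row)) / sum(row)


def reorder(table):
    """Permute the rows so that regions with similar profiles become neighbours."""
    return sorted(table, key=lambda region: barycenter(table[region]))


def simplify(table, threshold=1.0):
    """Reduce each region to one of two categories (the 'simplified map')."""
    return {
        region: ("young" if barycenter(row) < threshold else "old")
        for region, row in table.items()
    }
```

After simplification, R1 and R3 fall into the same category although their
rows differ, so the original table cannot be rebuilt from the categories
alone.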
A simplification is always debatable

Map (20), for example, is debatable. Indeed, why not (23) or (25) or (26)
or (28)?

(23) defines 7 categories of regions, from the youngest to the oldest, plus
one exceptional category defined by the simultaneous presence of the young
and the old.

(25) defines only 3 regions and the exceptional category.

(26) defines two "systems": young or old regions in 5 categories, and
regions of extreme or central classes.

(28) defines the same systems, but with only two categories in each
system.

Every simplification is a particular interpretation. Should it be imposed
on the user without justification, without providing the means to criticize
it? Moreover, it can be useful to know what differentiates two regions
grouped together in the schematic map. Scientific rigor therefore leads to
providing the exhaustive information in addition to the schema, that is, to
providing either (23) and (24) or (19) and (20).

(24) or (27), like (19), contain all the elements of other possible
interpretations, adapted to each user's specific problem.

This example is easily transposed to any problem of cartographic
simplification.

Cartographic "generalization" is thus a matrix operation.

"Generalization" is the simplification that becomes necessary, in
particular, when the scale of the map is reduced. It leads to eliminating
characteristics or grouping them. But how?

The matrix transformation shows that "generalization" is of the same nature
as the simplification, by permutations and regroupings, of the data table
corresponding to the map. This observation opens up many perspectives:
keeping or eliminating an indentation in a contour line, for example, does
not depend solely on the dimensions of that indentation. The decision
depends above all on the variation in the values of all the points (x) that
surround it,* for all the characteristics (y) chosen.

But everyone can defend his own list of characteristics! Generalization is
therefore a problem of choosing characteristics more than a problem of
method. Once the characteristics have been chosen, however, the matrix
conception of the problem makes it possible to envisage an automatic system
of generalization.
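
One way to picture such an automatic system is to merge neighbouring points
whose values, over all the chosen characteristics, are close. The sketch
below is an assumption made for illustration and is not described in the
paper: the distance measure, the tolerance, and the names profile_distance,
generalize, and POINTS are all hypothetical.

```python
def profile_distance(a, b):
    """Largest absolute difference between two points over all characteristics."""
    return max(abs(u - v) for u, v in zip(a, b))


def generalize(points, tolerance):
    """Group consecutive points (x) along a line.

    A point joins the current group when its values, for every chosen
    characteristic (y), stay within `tolerance` of the group's first point.
    Returns a list of groups, each a list of point indices.
    """
    groups = []
    for i, p in enumerate(points):
        if groups and profile_distance(points[groups[-1][0]], p) <= tolerance:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


# Each point carries values for two hypothetical characteristics
# (for instance altitude and slope). The figures are invented.
POINTS = [(10, 1), (11, 1), (10, 5), (10, 5), (30, 5)]

# With both characteristics, the second one splits the first four points.
two_chars = generalize(POINTS, tolerance=1)
# With the first characteristic alone, the same four points merge.
one_char = generalize([(p[0],) for p in POINTS], tolerance=1)
```

The two results differ only because the list of characteristics differs,
which is the point made above: the outcome of generalization is governed by
the choice of characteristics more than by the method.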

In summary


In cartography, each characteristic occupies the whole image. Consequently,
the two test questions are completed as follows:

1 - What are the x and the y of the data table?
But in addition, is the map

-  non-simplified (it transcribes all the data in the table),

-  or is it simplified?

2 - How are the x and the y of the data table grouped?
But in addition, which is the pertinent question:

-  the question in y: where is a given characteristic (overall
geographies), or

-  the question in x: at a given place, what is there (point information)?

*Remote sensing rests on this principle.

Every reader and every person in charge can use these two alternatives to
judge the answers supplied by a map, or to avoid the principal errors.

First error - Not featuring the x and y of the table.
The reader must easily recognize the space represented, which is generally
easy. But he must also instantly recognize the characteristics distributed.
And here, much remains to be done.

Second error - Superposing several characteristics when the overall reading
is useful.

This is, for example, the error of folklore maps. It is the error of all
cartograms. The information is not simplified; it is complete (exhaustive).
But it is impossible to see the "geography" that the characteristics
construct, or to isolate a given characteristic for the purposes of various
comparisons. The question "where is a given characteristic?" has no answer.

Third error - Making a simplified map when exhaustiveness is necessary.

This is the error of schematic maps not accompanied by their justification.
Every simplification is debatable. It does not replace the exhaustive data.

For errors 2 and 3, the general solution is to provide, in addition, a
separate map of each characteristic. These maps can be very small while
remaining exhaustive. And they can be monochrome, black ensuring the best
separation. Where appropriate, one can also provide the corresponding
mathematical or graphic treatment. But the basic information is supplied
only by the data matrix.

Conclusion 

Any problem can be posed in the form of a network of relations or in the
form of a matrix of relations. But a network rapidly becomes illegible,
whereas the visual matrix makes it possible to move spontaneously from the
detail to the whole and from the whole to the detail, while accepting a
large number of data. The purpose of graphics is to use this property of
visual perception in order to understand better and decide better.

Using this property in cartography, and conceiving every map as the
transformation of a data matrix, makes it possible to approach the general
theory of cartography in a precise and concise manner.

This transformation underlines that the map, like any network, easily
becomes illegible. The detail destroys the whole, and the choice between the
two is unavoidable.

This transformation clarifies the choices by providing a complete analysis
of the "why?" (pertinent questions), which makes it possible to define the
"how?" It shows that one does not "read" a map, any more than one "reads" a
diagram. One asks it questions. Every user must therefore learn the
hierarchy of pertinent questions.

Finally, this transformation avoids analyses that confuse the choice of
data with the transcription of data. There is not a world of the "real" and
its more or less random transcription. There is a finite table of data x, y,
to be transcribed cartographically.

Now the choice of the data x, y will always be perfectly free; this choice,
like the interpretation of the results of the treatment, is the problem of
the geologist, the historian, the geographer, the educator, the
physician ... It is not the problem of the cartographer.

On the other hand, the transcription and treatment of the data will always
be subject to the laws of logic and of visual perception. That is the
problem of the mathematician, the graphic designer, and the cartographer.
To each his role. And if the same individual plays both roles, he must know
how to keep them apart and learn both scripts. He must know his discipline.
But he must also know visual logic. The two test questions are the
essential introduction to it.